Abstract. It is shown that certain ensembles of random matrices with entries that vanish 
outside a band around the diagonal satisfy a localization condition on the resolvent which 
guarantees that eigenvectors have strong overlap with a vanishing fraction of standard basis 
vectors, provided the band width W raised to a power μ remains smaller than the matrix size 
N. For a Gaussian band ensemble, with matrix elements given by i.i.d. centered Gaussians 
within a band of width W, the estimate μ ≤ 8 holds. 



1. Introduction 

Random band matrices, with entries that vanish outside a band of width W around the 
diagonal, have been suggested [7, 8] as a model to study the crossover between a strongly 
disordered "insulating" regime, with localized eigenfunctions and weak eigenvalue correlations, 
and a weakly disordered "metallic" regime, with extended eigenfunctions and strong 
eigenvalue repulsion. Such a crossover is believed to occur in the spectra of certain random 
partial differential (or difference) operators as the spectral parameter (energy) is changed. 

In this paper, the strong disorder side of the band matrix crossover is analyzed. It is shown 
here that certain ensembles of random matrices whose entries vanish outside a band of width W 
around the diagonal satisfy a localization condition in the limit that the size of the matrix 
tends to infinity, provided W^μ/N → 0. This result requires certain assumptions on the 
distribution of the entries of the matrix, and the proof given here has technical requirements 
that may not be necessary. Nonetheless, the conditions imposed below (see §3) allow for a 
large family of interesting examples. In particular, one may consider a Gaussian distributed 
band matrix, with distribution 

(1.1) e-2^*^^-^-dXvK;^ 

where dX_{W;N} is the Lebesgue measure on the vector space of N × N matrices of band width 
W. That is, 

(1.2)   X_{W;N} = \begin{pmatrix}
d_{1,1} & a_{1,2} & \cdots & a_{1,W} &        \\
a_{2,1} & d_{2,2} &        &         & \ddots \\
\vdots  &         & \ddots &         &        \\
a_{W,1} &         &        &         &        \\
        & \ddots  &        &         &
\end{pmatrix}_{N \times N}

with d_{i,i} and a_{i,j} independent families of i.i.d. real and complex unit Gaussian variables, 
respectively. 

The main result obtained here is a localization estimate for the eigenvectors of the matrices 
X_{W;N}. This localization result is most conveniently stated in terms of the resolvent 
(X_{W;N} − λ)^{-1}, a well defined random matrix for λ ∈ ℝ. (We will see that λ is an eigenvalue 
of X_{W;N} with probability zero.) Let e_i denote the standard basis vectors, e_i(j) = δ_{i,j}. Then 

Theorem 1. If X_{W;N} has distribution (1.1), or more generally a distribution satisfying 
assumptions 1, 2, and 3 in §3 below, then there exist μ > 0 and σ < ∞ such that given 
r > 0 and s ∈ (0, 1) there are A_s < ∞ and α_s > 0 such that 

(1.3)   \mathbb{E}\left( |\langle e_i, (X_{W;N} - \lambda)^{-1} e_j \rangle|^s \right) \le A_s W^{s\sigma} e^{-\alpha_s |i-j| / W^{\mu}}

for all λ ∈ [−r, r] and all i, j = 1, …, N. For the Gaussian band ensemble (1.1), σ ≤ 1/2 and 
μ ≤ 8. 

Remarks: For the Gaussian band ensemble (1.1), the density of states, in the regime W, N → ∞, 
W/N → 0, is known to be the Wigner semi-circle law (see §1.2 below). For λ outside the 
support of the semi-circle law, one could obtain (1.3) with μ = 1 using Lifschitz tail type 
estimates. This will be dealt with in a separate paper. 

Theorem 1 estimates the decay of matrix elements of the resolvent away from the diagonal. 
Using techniques developed in the context of discrete random Schrödinger operators one may 
obtain from (1.3) estimates on eigenvectors. 

Theorem 2 (Eigenvector localization). Let X_{W;N} have distribution (1.1), or more generally 
a distribution satisfying assumptions 4 and 5 in §3. 

(1) With probability one all eigenvalues of X_{W;N} are simple. 

(2) If (1.3) holds for all λ in an interval [−r, r] and if λ_k, k = 1, …, N, are the 
eigenvalues of X_{W;N} with corresponding eigenvectors v_k, k = 1, …, N, then there are 
B < ∞, τ > 0, and β > 0 such that 

(1.4)   \mathbb{E}\Big( \sup_{k:\, \lambda_k \in [-r,r]} |v_k(i)|\, |v_k(j)| \Big) \le B\, W^{\tau} e^{-\beta |i-j| / W^{\mu}}

for all i, j = 1, …, N. 

Remark. For the proof of this theorem, the reader is directed to the corresponding result in 
the context of random Schrödinger operators. See for example [12] for the non-degeneracy 
of the eigenvalues and [2, Theorem A.1] for a derivation of (1.4) from Green's function decay 
(1.3). In both cases, the proof involves only averaging over the coupling of a rank one 
perturbation and can be applied in the present context. 

1.1. Sketch of the Proof. The proof of Theorem 1 is based on two observations, which 
may be summarized as follows.¹ Let G_{W;N}(i,j) = ⟨e_i, (X_{W;N} − λ)^{-1} e_j⟩. Then 

(1) The random variable G_{W;N}(i,j) is rarely large. This may be expressed through a 
bound (uniform in N) on the tails of the distribution of G_{W;N}(i,j): 

¹The idea to study localization via these two complementary estimates was suggested in the context of 
random Schrödinger operators by Michael Aizenman, and is inspired by the Dobrushin–Shlosman proof [10] 
of the Mermin–Wagner Theorem [17] on the absence of continuous symmetry breaking in classical statistical 
mechanics of dimension 2. 
Lemma W. If X_{W;N} has distribution (1.1), or more generally a distribution with 
the properties outlined in §3 below, then there exist κ > 0 and σ < ∞ such that 

(1.5)   \mathrm{Prob}\left( |G_{W;N}(i,j)| > t \right) \le \frac{\kappa W^{\sigma}}{t}.

(2) The fluctuations of ln |G_{W;N}(i,j)| grow at least linearly with |i − j|. One would 
typically express the growth of fluctuations by an inequality like 

Var( ln |G_{W;N}(i,j)| ) ≥ const. |i − j|, 

where Var(X) = E(X²) − E(X)² is the variance of a random variable X. However 
for present purposes a more convenient quantitative expression of this idea is the 
following 

Lemma F. If X_{W;N} has distribution (1.1), or more generally a distribution with the 
properties outlined in §3 below, then there is μ > 0 such that if 0 < r < s < 1 and 
\i — j\ > SW then 

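(1.6)   \mathbb{E}\left( |G_{W;N}(i,j)|^r \right) \le \exp\left( -C_{r,s} W^{-\mu} |i-j| \right) \mathbb{E}\left( |G_{W;N}(i,j)|^s \right)^{r/s}

with C_{r,s} > 0. For the Gaussian band ensemble (1.1), μ ≤ 8. 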

Lemmas W and F together easily imply Theorem 1. Indeed, it suffices to show that the 
second factor on the right hand side of (1.6) is uniformly bounded. But it follows from 
Lemma W that 

(1.7)   \mathbb{E}\left( |G_{W;N}(i,j)|^s \right) \le \frac{\kappa^s}{1-s} W^{s\sigma}.

This observation, which is the basis of the fractional moment analysis of random Schrödinger 
operators [3, 1, 2], follows easily from (1.5) since 

(1.8)   \mathbb{E}\left( |G_{W;N}(i,j)|^s \right) = s \int_0^\infty \mathrm{Prob}\left( |G_{W;N}(i,j)| > t \right) t^{s-1}\, dt,

and probabilities are bounded by one. 
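The step from a weak-type tail bound to a finite fractional moment can be checked numerically. The sketch below is only an illustration: the constant c stands in for the tail constant κW^σ, and the function names are hypothetical, not taken from the paper. It evaluates s∫_0^∞ min(1, c/t) t^{s−1} dt with a trapezoid rule in the variable u = ln t and compares it with the closed form c^s/(1 − s).

```python
import numpy as np

def fractional_moment_bound(c, s):
    """Closed form of s * int_0^inf min(1, c/t) * t**(s-1) dt for 0 < s < 1."""
    return c ** s / (1.0 - s)

def fractional_moment_integral(c, s, log_range=40.0, step=1e-3):
    """Evaluate the same integral numerically after substituting t = exp(u)."""
    u = np.arange(-log_range, log_range + step, step)
    t = np.exp(u)
    # tail bound min(1, c/t), weight s * t**(s-1), and dt = t du
    integrand = s * np.minimum(1.0, c / t) * t ** s
    # trapezoid rule written out by hand to avoid version-specific NumPy helpers
    return float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(u)) / 2.0)
```

The integral splits at t = c: below c the probability is bounded by one, above c by c/t, which is where the factor 1/(1 − s) comes from.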

It may not be immediately clear what Lemma F has to do with large fluctuations. Towards 
understanding this, let X = ln |G_{W;N}(i,j)|. By the Hölder inequality, for 0 < r < s, 

(1.9)   \mathbb{E}(e^{rX}) \le \mathbb{E}(e^{sX})^{r/s}.

Furthermore, equality holds only if X is non random, that is, if there is x_0 ∈ ℝ so that X = x_0 
almost surely. In other words 

(1.10)   \mathbb{E}(e^{rX}) = e^{-h(r,s)}\, \mathbb{E}(e^{sX})^{r/s}

with h(r, s) > 0 unless X is non random. 

If X were Gaussian with variance σ² (and arbitrary mean), then h(r, s) would be 
proportional to the variance: 

(1.11)   h(r,s) = \frac{1}{2}\, r (s - r)\, \sigma^2.
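For the Gaussian case the constant can be worked out directly from the moment generating function. The following short derivation is a sketch; the symbols m (mean) and v (variance) are notation chosen here for illustration.

```latex
% X Gaussian with mean m and variance v; h(r,s) is defined by (1.10)
\mathbb{E}\bigl(e^{rX}\bigr) = e^{rm + \frac{1}{2} r^2 v},
\qquad
\mathbb{E}\bigl(e^{sX}\bigr)^{r/s} = e^{rm + \frac{1}{2} r s v},
\qquad\text{so}\qquad
e^{-h(r,s)} = \frac{\mathbb{E}(e^{rX})}{\mathbb{E}(e^{sX})^{r/s}}
            = e^{-\frac{1}{2} r (s-r) v},
\quad\text{i.e.}\quad
h(r,s) = \tfrac{1}{2}\, r\,(s-r)\, v .
```

The mean drops out, and h(r, s) vanishes exactly when v = 0, in line with the equality case of Hölder's inequality.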

For a general random variable X, the associated quantity h(r, s) may be taken as a measure 
of the fluctuations of X. In place of (1.11), we have the following identity for h in terms of 
the variance of X in weighted ensembles: 
Proposition 3. Let X be a random variable with E(e^{sX}) < ∞ for some s > 0. If r ∈ (0, s), 
then E(e^{rX}) < ∞ and 

(1.12)   h(r,s) = \frac{1}{s} \int_0^s \min(r,q)\, (s - \max(r,q))\, \mathrm{Var}_q(X)\, dq,

where h(r, s) is defined by (1.10) and 

(1.13)   \mathrm{Var}_q(X) = \frac{\mathbb{E}(X^2 e^{qX})}{\mathbb{E}(e^{qX})} - \left( \frac{\mathbb{E}(X e^{qX})}{\mathbb{E}(e^{qX})} \right)^2

is the variance of X with respect to the weighted probability measure 
Prob_q(A) = E(χ_A e^{qX}) / E(e^{qX}). 

Proof. Hölder's inequality is the statement that the function Φ(r) = ln E(e^{rX}) is convex. In 
particular, if s > 0 then 

(1.14)   \Phi(r) \le \frac{r}{s} \Phi(s)

for r ∈ (0, s), since Φ(0) = ln E(1) = 0. If E(e^{sX}) < ∞, it follows that Φ is bounded on [0, s]. 

The identity (1.12) follows from Taylor's formula with remainder. Indeed, the second 
derivative of Φ at q is equal to the weighted variance Var_q(X). Thus, 

(1.15)   \Phi(s) = \Phi(r) + \Phi'(r)(s - r) + \int_r^s (s - q)\, \mathrm{Var}_q(X)\, dq, \quad\text{and}

(1.16)   0 = \Phi(0) = \Phi(r) - \Phi'(r)\, r + \int_0^r q\, \mathrm{Var}_q(X)\, dq.

Taking a convex combination of these identities, with weights r/s and (s − r)/s chosen so the 
first order terms cancel, gives 

(1.17)   \frac{r}{s} \Phi(s) = \Phi(r) + \frac{s-r}{s} \int_0^r q\, \mathrm{Var}_q(X)\, dq + \frac{r}{s} \int_r^s (s - q)\, \mathrm{Var}_q(X)\, dq
                             = \Phi(r) + \frac{1}{s} \int_0^s \min(r,q)\, (s - \max(r,q))\, \mathrm{Var}_q(X)\, dq,

which is equivalent to (1.12). □ 
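The identity of Proposition 3 is easy to test on a random variable with finitely many values. The sketch below is an illustration under that assumption; the function names and the sample distribution are hypothetical. It computes h(r, s) once from its definition, h = (r/s) ln E(e^{sX}) − ln E(e^{rX}), and once from the weighted-variance integral.

```python
import numpy as np

def log_mgf(q, x, p):
    """Phi(q) = ln E(exp(q X)) for X taking values x with probabilities p."""
    return float(np.log(np.sum(p * np.exp(q * x))))

def holder_gap(r, s, x, p):
    """h(r, s) from its definition: E(e^{rX}) = exp(-h) * E(e^{sX})**(r/s)."""
    return (r / s) * log_mgf(s, x, p) - log_mgf(r, x, p)

def weighted_variance(q, x, p):
    """Variance of X under the tilted measure with weights p * exp(q x)."""
    w = p * np.exp(q * x)
    w = w / w.sum()
    mean = np.sum(w * x)
    return float(np.sum(w * (x - mean) ** 2))

def holder_gap_from_variance(r, s, x, p, n=20001):
    """h(r, s) from the integral of the tilted variances (trapezoid rule in q)."""
    q = np.linspace(0.0, s, n)
    kernel = np.minimum(r, q) * (s - np.maximum(r, q))
    var = np.array([weighted_variance(qi, x, p) for qi in q])
    y = kernel * var
    return float(np.sum((y[1:] + y[:-1]) * np.diff(q)) / (2.0 * s))
```

For example, with values x = (−1, 0, 2) and probabilities p = (0.2, 0.5, 0.3), the two functions should agree to the accuracy of the quadrature, and both are nonnegative, as Hölder's inequality requires.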

Thus Lemma F may be understood as giving a lower bound on the fluctuations of X = 
ln |G_{W;N}(i,j)| as measured by the improvement to Hölder's inequality. The proof of this 
result will be accomplished using a product formula for G_{W;N}(i,j) that expresses this 
quantity as a matrix element of a product of O(|i − j|/W) matrices of size W × W. Prop. 3 
will be applied to factors in this product, with each factor contributing a term of size W^{1−μ} 
to h(r, s). Since there are O(|i − j|/W) terms, this produces the claimed decay. 

The strategy taken below in proving Lemmas W and F has two parts. First we identify 
certain axioms for the distribution of X_{W;N} which lead naturally to the lemmas. Second, we 
verify that the Gaussian band ensemble (1.1) satisfies these axioms. To motivate the form 
of the axioms for the distribution of X_{W;N}, we begin in §2 with a self contained sketch of the 
argument in the tri-diagonal case W = 2. In §3 we state the assumptions needed to adapt 
the proof to W > 2, state the associated general results and prove Lemma W. In §4 we get 
to the heart of the matter and prove Lemma F. In §5 we discuss examples of ensembles, 
including the Gaussian band ensemble (1.1), satisfying the axioms of §3. In an appendix, an 
elementary probability lemma used below is stated and proved. 
1.2. Remarks on the literature and open problems. In [7, 8] it was observed, based 
on numerical evidence, that the localization of eigenfunctions and eigenvalue statistics of the 
Gaussian band ensemble (1.1) are essentially determined by the parameter W²/N. When 
W²/N ≪ 1 the eigenfunctions are strongly localized and the eigenvalue process is close to 
a Poisson process. When W²/N ≫ 1 the eigenfunctions are extended and the eigenvalue 
statistics are well described by the Gaussian unitary ensemble (GUE). A theoretical physics 
explanation of these numerical results was given by Fyodorov and Mirlin [13]. They considered 
a slightly different ensemble in which a full GUE matrix is modified by multiplying each 
element by a factor which decays exponentially in the distance from the diagonal. For this 
model, on the basis of super-symmetric functional integrals, they obtain an effective σ-model 
approximation which, at the level of saddle point analysis, shows a localization/delocalization 
transition at W ≈ √N. 

Theorem 1 is consistent with the above picture. However, [7, 8, 13] suggest that the proper 
exponent on the r.h.s. of (1.3) would be μ = 2. 

Problem 1. What is the optimal value of μ in (1.3)? In particular, does this equation hold 
with μ = 2? 

In the physics literature, the nature of eigenvalue processes in the large N limit is generally 
expected to be related to localization properties of the eigenfunctions, with Poisson statis- 
tics corresponding to localized eigenfunctions and Wigner-Dyson statistics corresponding to 
extended eigenfunctions. Let us call this idea the "statistics/localization diagnostic." (In 
the context of band random matrices, a vector v is a function on the index set {1, …, N}, 
namely v(i) = i-th coordinate of v. The statistics/localization diagnostic suggests that the 
eigenvalues of a random matrix should be approximately uncorrelated if a typical eigenvec- 
tor is essentially supported on a vanishing fraction of {1, . . . ,N}, and should show strong 
correlations if it is typically spread over more or less the entire index set.) 

The extreme cases W = 1 and W = N of the Gaussian band ensemble (1.1) are consistent 
with the statistics/localization diagnostic. Indeed, with W = 1, the matrix is diagonal and the 
eigenvalues, which are just the diagonal entries d_{j,j}, are independent. After suitable rescaling 
the eigenvalue process converges to a Poisson process in the large N limit. (This is essentially 
the definition of a Poisson process.) Likewise the eigenfunctions are the elementary basis 
vectors e_i(j) = δ_{i,j}, which are localized on single sites. On the other hand, with W = N 
the matrix X_{W;N} is sampled from the GUE. In this case, the eigenfunctions together form 
a uniformly distributed orthonormal frame, so they are completely extended, and a suitable 
rescaling of the eigenvalue process converges in distribution to an explicit determinantal 
point process as calculated by Dyson [T^ ITT]. 

Based on the statistics/localization diagnostic, it is reasonable to conjecture that Poisson 
statistics hold for local fluctuations of the eigenvalues of X_{W;N} in a limit N → ∞ with W = 
W(N) → ∞ provided W(N)^μ/N → 0. (One must be a little careful with the diagnostic, 
as it is easy to concoct random matrices with totally extended eigenfunctions and arbitrary 
statistics: put N random numbers with any given joint distribution on the diagonal of a 
matrix and conjugate the result with a random unitary! Of course, in that ensemble the 
matrix elements will most likely be highly correlated. Thus, it remains plausible that the 
statistics/localization diagnostic is correct, at least, for matrices with independent matrix 
elements.) 

For random Schrödinger operators, Minami has derived Poisson statistics for the local 
correlations of the eigenvalue process from exponential decay of the resolvent [18]. Some 
aspects of Minami's proof translate to the present context. Most notably, the so-called 
Minami estimate which bounds the probability of having two eigenvalues in a small interval, 

(1.18)   \frac{1}{N^2}\, \mathrm{Prob}\left[ \#\{\lambda_j \in I\} \ge 2 \right] \le C_W |I|^2,

where |I| is the length of the interval and λ_1 ≤ ⋯ ≤ λ_N are the eigenvalues of X_{W;N}, holds 
with 

(1.19)   C_W \propto W^{2\sigma}.

Here σ is as in Thm. 1. (The proof of this fact may be accomplished by following Minami's 
argument or by one of the various alternatives that have appeared recently in the literature 

[a Eli.) 

However, one crucial ingredient is missing: we lack sufficient control on the convergence 
of the density of states. The density of states of X_{W;N} is the measure κ_{W;N}(λ)dλ on the real 
line giving the density of the eigenvalue process: 

(1.20)   \int_I \kappa_{W;N}(\lambda)\, d\lambda = \frac{1}{N}\, \mathbb{E}\left( \#\{\lambda_j \in I\} \right).

As indicated, κ_{W;N}(λ)dλ is absolutely continuous. In fact, it follows from the Wegner 
estimate, (3.8) below, that 

(1.21)   |\kappa_{W;N}(\lambda)| \lesssim W^{\sigma},

so analogous to (1.18) we have 

(1.22)   \frac{1}{N}\, \mathrm{Prob}\left[ \#\{\lambda_j \in I\} \ge 1 \right] \le \frac{1}{N}\, \mathbb{E}\left( \#\{\lambda_j \in I\} \right) = \int_I \kappa_{W;N}(\lambda)\, d\lambda \lesssim W^{\sigma} |I|.

(In fact, the Minami estimate is proved in a similar way, by showing that the expected number 
of eigenvalue pairs in I is bounded by the r.h.s. of (1.18).) 

To study local fluctuations of the eigenvalue processes near λ_0 ∈ ℝ, it is natural to consider 
the re-centered and re-scaled process 

(1.23)   \tilde{\lambda}_j = N (\lambda_j - \lambda_0),

which has mean spacing O(1). We say that the eigenvalue process has Poisson statistics 
near λ_0, in some limit W = W(N) and N → ∞, if the point process {\tilde{λ}_j} converges to 
a Poisson process. The density of this Poisson process would then be given by the limit 
lim_{N→∞} κ_{W(N);N}(λ_0). The difficulty is we do not know that this limit exists. 

Now, for a fairly general class of matrix ensembles with independent centered entries, e.g., 
for the Gaussian ensemble (1.1), it is known that the density of states κ_{W;N} converges weakly 
to the semi-circle law, provided W(N)/N → 0 or 1 (see [15]). That is, 

(1.24)   \frac{1}{N}\, \mathbb{E}\left( \mathrm{tr}\, f(X_{W(N);N}) \right) = \int f(\lambda)\, \kappa_{W(N);N}(\lambda)\, d\lambda \longrightarrow \frac{1}{2\pi} \int_{-2}^{2} f(t) \sqrt{4 - t^2}\, dt.

However, as indicated this is a weak convergence result, and it does not follow that 

(1.25)   \kappa_{W(N);N}(\lambda) \longrightarrow \frac{1}{2\pi} \sqrt{4 - \lambda^2}\; I[|\lambda| \le 2],



or even that 

1.26) / Kw-N^dX ^Va^I[\\\ < 2], 

i(Ao-^,Ao+A) 27r 
which would in fact be sufficient to control the density of the putative limit process. 
In this regard, let us state a couple of open problems. 

Problem 2. Improve the estimate (1.21). In particular, does this bound hold with σ = 0? 
(The interpretation of 1/(N κ_{W;N}(λ)) as the mean eigenvalue spacing and the convergence (1.24) 
suggests that κ should be bounded.) 

Problem 3. Verify either (1.25) or (1.26). 

Acknowledgments. I would like to thank Tom Spencer and Michael Aizenman for many 
interesting discussions related to this and other works, and to express my gratitude for the 
hospitality extended me by the Institute for Advanced Study, where I was a member when this 
project started, and more recently by the Isaac Newton Institute during my stay associated 
with the program Mathematics and Physics of Anderson Localization: 50 Years After. 



2. Tridiagonal matrices 



The aim of this section is to motivate the assumptions on the distribution of X_{W;N}, spelled 
out below in §3, by examining separately the somewhat simpler case W = 2. Thus, consider 
for each N ∈ ℕ, a random tridiagonal matrix 

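(2.1)   X_{2;N} = \begin{pmatrix}
v_1   & t_1    &           &           &         \\
t_1^* & v_2    & t_2       &           &         \\
      & \ddots & \ddots    & \ddots    &         \\
      &        & t_{N-2}^* & v_{N-1}   & t_{N-1} \\
      &        &           & t_{N-1}^* & v_N
\end{pmatrix}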

with v_1, v_2, … and t_1, t_2, … two given mutually independent sequences of independent 
random variables, real and complex valued respectively. For such matrices, exponential decay 
of the Green's function and localization of eigenfunctions can be obtained by the transfer 
matrix approach, see [6]. Here we use a different method, which is closely related to the 
technique of Kunz and Souillard [16]. 

As discussed above, the central technical estimate is a bound on E(|⟨e_i, (X_{2;N} − λ)^{-1} e_j⟩|^s) 
decaying exponentially in the distance |i − j|. To obtain this bound, it is convenient to 
assume that (v_k) are identically distributed and likewise (t_k). (This assumption could be 
replaced by uniformity in k of certain bounds assumed below. Likewise, strict independence 
of (v_k) is not really the issue. The argument could easily be adapted to the situation in which 
(v_k) are generated by a distribution with finite range coupling, such as ∏_k ρ(v_k − v_{k−1}) dv_k.) 
The distribution of (t_k) can be arbitrary; these variables may even be deterministic, as 
in the case of random Jacobi matrices. 

To facilitate the fluctuation argument proposed above we will suppose the common 
distribution of v_k has a density ρ with the following property: 

Definition 1. We say that a probability density ρ on ℝ is fluctuation regular if there are 
constants ε, δ > 0 and a measurable set Ω ⊂ ℝ with ∫_Ω ρ(v)dv > 0 such that 

(2.2)   \frac{\rho(v_1)}{\rho(v_2)} \ge \delta \quad\text{for all } v \in \Omega \text{ and all } v_1, v_2 \in (v - \varepsilon, v + \varepsilon).
Remark. A sufficient condition for fluctuation regularity is that ln ρ is Lipschitz on some 
open interval. For example a uniform distribution ρ(x) ∝ χ_{[a,b]}(x) is fluctuation regular. So 
are the Gaussian and Cauchy distributions. However, fluctuation regularity is quite a bit 
stronger than absolute continuity of the measure ρ dx, since it implies the existence of an 
open set on which ρ is strictly positive. 
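Fluctuation regularity can be probed numerically for a given density. The sketch below is an illustration only: it takes Ω to be an interval [omega_lo, omega_hi], and the function names, grid sizes and choice of Ω are assumptions made here, not part of the definition. It returns the smallest ratio of density values found within any window of half-width ε centered in Ω, which serves as a candidate for the constant δ.

```python
import numpy as np

def regularity_constant(rho, omega_lo, omega_hi, eps, n_centers=201, n_window=41):
    """Smallest ratio rho(v1)/rho(v2) over windows (v - eps, v + eps), v in [omega_lo, omega_hi]."""
    centers = np.linspace(omega_lo, omega_hi, n_centers)
    offsets = np.linspace(-eps, eps, n_window)
    worst = np.inf
    for v in centers:
        values = rho(v + offsets)
        # the worst ratio on a window is min / max of the density there
        worst = min(worst, values.min() / values.max())
    return worst

def gaussian_density(x):
    """Standard Gaussian density."""
    return np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)
```

For the standard Gaussian with Ω = [−1, 1] and ε = 0.1, the worst window sits at the edge of Ω and the ratio equals exp(−2 · 1 · 0.1) = exp(−0.2), so any δ up to that value works.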

Our goal in this section is to prove the following result: 

Theorem 4. Let (v_k)_{k=1}^∞ and (t_k)_{k=1}^∞ be two sequences of i.i.d. random variables, real and 
complex valued respectively. Suppose that the common distribution of v_k has a density ρ 
which is bounded and fluctuation regular. Then for 0 < s < 1 and Λ > 0 there are A_s < ∞ 
and μ_{s,Λ} > 0 such that for all λ ∈ [−Λ, Λ], 

(2.3)   \mathbb{E}\left( |\langle e_i, (X_{2;N} - \lambda)^{-1} e_j \rangle|^s \right) \le A_s\, e^{-\mu_{s,\Lambda} |i-j|}.

Remark. We restrict λ to a compact set to facilitate the fluctuation argument below. In 
fact, for large |λ| the rate of exponential decay will improve, although the mechanism will be 
somewhat different. One could construct a proof in this context along the lines of [5]. Thus 
the Λ dependence of the mass of decay μ_{s,Λ} may be dropped. 

Let g_N(i,j;λ) = ⟨e_i, (X_{2;N} − λ)^{-1} e_j⟩. Recall that the decay of E(|g_N(i,j;λ)|^s) was to 
be established in two steps, the first of those being Lemma W which gives finiteness of the 
fractional moments. A preliminary observation is that Lemma W holds for these tridiagonal 
matrices: 

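Lemma 2.1 (Lemma W for X_{2;N}). Suppose that the distribution of v_k, k = 1, …, N, satisfies 

(2.4)   \mathrm{Prob}(v_k \in [a,b]) \le \kappa\, |b - a|,

for any interval [a, b], with κ a finite constant. Then 

(2.5)   \mathrm{Prob}\left( |g_N(i,j;\lambda)| > t \,\middle|\, (v_k)_{k \ne i,j},\, (t_k) \right) \le \frac{\kappa}{t},

so, in particular, 

(2.6)   \mathrm{Prob}\left( |g_N(i,j;\lambda)| > t \right) \le \frac{\kappa}{t}

and 

(2.7)   \mathbb{E}\left( |g_N(i,j;\lambda)|^s \right) \le \frac{\kappa^s}{1-s}

for 0 < s < 1. 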

Remark. The l.h.s. of (2.5) is the conditional probability of the event {|g_N(i,j;λ)| > t} at 
specified values of (v_k)_{k≠i,j} and (t_k), that is, the probability conditioned on the σ-algebra 
generated by these variables. Eq. (2.5) is a standard estimate from the fractional moment 
analysis of discrete random Schrödinger operators, see [1]. The main point of this result is 
that to bound E(|g_N(i,j;λ)|^s), it is sufficient to average over v_i and v_j. 

The second part of the argument is to establish large fluctuations for g_N(i,j;λ); this 
is Lemma F above. In the present context we have 

Lemma 2.2 (Lemma F for X_{2;N}). Under the hypotheses of Theorem 4, for each 0 < r < 
s < 1 and Λ ∈ ℝ there is a constant 0 < C_{r,s;Λ} < ∞ such that 

(2.8)   \mathbb{E}\left( |g_N(i,j;\lambda)|^r \right) \le \exp\left( -C_{r,s;\Lambda} |i-j| \right) \mathbb{E}\left( |g_N(i,j;\lambda)|^s \right)^{r/s}

for λ ∈ [−Λ, Λ]. 

Remark. Together, Lemmas 2.1 and 2.2 prove Theorem 4. 

Proof. Let us fix λ for the moment and drop it from the notation: g_N(i,j) = g_N(i,j;λ). 
Suppose without loss of generality that i < j. A preliminary observation is that 

(2.9)   g_N(i,j) = -\, g_{j-1}(i, j-1)\; t_{j-1}\; g_N(j,j),

which may be established using the resolvent identity, writing X_{2;N} as a perturbation of the 
corresponding matrix with t_{j−1} set equal to zero (which decouples into two distinct blocks). 
Iteration of this identity gives 

(2.10)   g_N(i,j) = (-1)^{|i-j|} \Big[ \prod_{k=i}^{j-1} g_k(k,k)\, t_k \Big]\, g_N(j,j).

Thus 

(2.11)   \ln |g_N(i,j)| = \sum_{k=i}^{j-1} \ln |t_k| + \sum_{k=i}^{j-1} \ln |g_k(k,k)| + \ln |g_N(j,j)|,

suggesting that if either ln |t_k| or ln |g_k(k,k)| were to exhibit fluctuations of order one, then 
the variance of ln |g_N(i,j)| would be of order |i − j| and Lemma 2.2 would follow. However, 
there are substantial correlations between the various terms, making it difficult to proceed 
directly along this line of argument. 
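The product formula for the off-diagonal Green's function can be verified numerically on a small tridiagonal matrix. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, indices i, j are 1-based as in the text, the entry t_k is placed at position (k, k+1) with its conjugate at (k+1, k), and a complex spectral parameter is used in the example only to keep all the inverses well conditioned (the identity is algebraic and holds for real λ as well, whenever the inverses exist).

```python
import numpy as np

def tridiagonal(v, t):
    """Hermitian tridiagonal matrix with diagonal v and upper off-diagonal t."""
    n = len(v)
    x = np.diag(np.asarray(v, dtype=complex))
    for k in range(n - 1):
        x[k, k + 1] = t[k]
        x[k + 1, k] = np.conj(t[k])
    return x

def green(v, t, lam, i, j):
    """<e_i, (X - lam)^{-1} e_j> for the len(v) x len(v) matrix (1-based i, j)."""
    n = len(v)
    x = tridiagonal(v, t[: n - 1])
    return np.linalg.inv(x - lam * np.eye(n))[i - 1, j - 1]

def green_product(v, t, lam, i, j):
    """Right hand side of the product formula, for i < j."""
    out = (-1.0) ** (j - i) * green(v, t, lam, j, j)
    for k in range(i, j):
        # g_k(k, k) is the corner element of the resolvent of the top-left k x k block
        out = out * green(v[:k], t, lam, k, k) * t[k - 1]
    return out
```

Calling green(v, t, lam, 2, 7) and green_product(v, t, lam, 2, 7) on the same random input should return the same complex number up to rounding.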

To make a precise analysis, let us consider the random variables 

(2.12)   \gamma_k = \frac{1}{g_k(k,k)},

which are related by a recursion relation 

(2.13)   \gamma_k = v_k - \lambda - \frac{|t_{k-1}|^2}{\gamma_{k-1}}, \qquad 2 \le k \le N,

with 

(2.14)   \gamma_1 = v_1 - \lambda.

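These identities may be established using the Schur-complement formula. In a similar way, 
the Schur-complement formula may be used to show that 

(2.15)   \frac{1}{g_N(j,j)} = v_j - \lambda - \frac{|t_{j-1}|^2}{\gamma_{j-1}} - |t_j|^2 G_{j+1} = \gamma_j - |t_j|^2 G_{j+1},

where G_{j+1} = ⟨e_{j+1}, (\tilde{X}_{2;N} − λ)^{-1} e_{j+1}⟩ with \tilde{X}_{2;N} the matrix obtained from X_{2;N} by setting 
t_j = 0: 

(2.16)   \tilde{X}_{2;N} = \begin{pmatrix}
\ddots & \ddots    &         &           &         &        \\
\ddots & v_{j-1}   & t_{j-1} &           &         &        \\
       & t_{j-1}^* & v_j     & 0         &         &        \\
       &           & 0       & v_{j+1}   & t_{j+1} &        \\
       &           &         & t_{j+1}^* & v_{j+2} & \ddots \\
       &           &         &           & \ddots  & \ddots
\end{pmatrix}

In particular, G_{j+1} is a function of the variables (v_k)_{k=j+1}^{N} and (t_k)_{k=j+1}^{N-1}. 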
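The recursion for the inverse diagonal Green's functions γ_k = 1/g_k(k,k) lends itself to a quick numerical check. The sketch below is an illustration; the function names are hypothetical, the entry t_k is assumed to sit at position (k, k+1) of the matrix, and a complex spectral parameter is used in the example to avoid accidental near-singularities. It compares the Schur-complement recursion γ_1 = v_1 − λ, γ_k = v_k − λ − |t_{k−1}|²/γ_{k−1} with direct inversion of the top-left k × k blocks.

```python
import numpy as np

def gamma_direct(v, t, lam, k):
    """gamma_k = 1 / <e_k, (X_{2;k} - lam)^{-1} e_k> by direct inversion (1-based k)."""
    x = np.diag(np.asarray(v[:k], dtype=complex))
    for m in range(k - 1):
        x[m, m + 1] = t[m]
        x[m + 1, m] = np.conj(t[m])
    g = np.linalg.inv(x - lam * np.eye(k))[k - 1, k - 1]
    return 1.0 / g

def gamma_recursion(v, t, lam):
    """All gamma_k, k = 1..len(v), from the Schur-complement recursion."""
    gammas = [v[0] - lam]
    for k in range(1, len(v)):
        gammas.append(v[k] - lam - abs(t[k - 1]) ** 2 / gammas[-1])
    return np.array(gammas)
```

Because the map from (v_1, …, v_N) to (γ_1, …, γ_N) adds to each v_k a function of earlier variables only, its Jacobian is triangular with unit diagonal, which is the property used in the change of variables that follows.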

We now make a change of variables v_k ↦ γ_k in our probability space. The Jacobian is 
triangular with ones on the diagonal and therefore has determinant one. Thus 

(2.17)   Joint distribution of (γ_k)_{k=1}^N given (t_k)_{k=1}^N 
             = \rho(\gamma_1 + \lambda) \prod_{k=2}^{N} \rho\Big( \gamma_k + \lambda + \frac{|t_{k-1}|^2}{\gamma_{k-1}} \Big) \prod_{k=1}^{N} d\gamma_k.

So γ_k are a chain of variables with nearest neighbor couplings; thinking of k as a time 
parameter, {γ_k} is a Markov chain. In terms of these variables, we have 



(2.18)   g_N(i,j) = (-1)^{|i-j|} \Big[ \prod_{k=i}^{j-1} \frac{t_k}{\gamma_k} \Big] \frac{1}{\gamma_j - |t_j|^2 G_{j+1}},

where G_{j+1} may be written as a function of (γ_k)_{k≥j} and (t_k)_{k≥j}, since 
v_k = γ_k + λ + |t_{k−1}|²/γ_{k−1}. 
A useful trick for analyzing fluctuations in this context, inspired by the Dobrushin–Shlosman 
analysis of continuous symmetries in 2D classical statistical mechanics [10], is to couple 
the system to a family of independent identically distributed random variables α_2, α_5, …, 
each with absolutely continuous distribution H(α_k)dα_k. For technical reasons, which will 
become apparent below, we introduce α_k only for k ≡ 2 mod 3. Let us define 

(2.19)   f_k = e^{\alpha_k} \gamma_k,

where we take α_k = 0 for k ≢ 2 mod 3. The Jacobian determinant of the transformation 
(γ_k, α_k) ↦ (f_k, α_k) is ∏_{k=2,5,8,…} e^{α_k}, so 

(2.20)   joint distribution of (f_k)_{k=1}^N and (α_k)_{k=1}^N, given (t_k)_{k=1}^N 
             = \prod_{k \equiv 2 \bmod 3}
               \rho\Big( f_{k-1} + \lambda + \frac{|t_{k-2}|^2}{f_{k-2}} \Big)
               \rho\Big( e^{-\alpha_k} f_k + \lambda + \frac{|t_{k-1}|^2}{f_{k-1}} \Big)
               \rho\Big( f_{k+1} + \lambda + e^{\alpha_k} \frac{|t_k|^2}{f_k} \Big)
               H(\alpha_k)\, e^{-\alpha_k}
               \times df_{k-1}\, df_k\, df_{k+1}\, d\alpha_k,

with the convention that t_0 = 0. 

We now fix (f_k)_{k=1}^N, and consider the conditional distribution of (α_k)_{k=1}^N, which carries 
some information on the distribution of (γ_k)_{k=1}^N. A key point is that the variables α_k remain 
independent after conditioning. They are, however, no longer identically distributed. Instead, 

(2.21)   distribution of α_k given (t_ℓ)_{ℓ=1}^N and (f_ℓ)_{ℓ=1}^N 
             = \frac{1}{Z_k}\, \rho\Big( e^{-\alpha_k} f_k + \lambda + \frac{|t_{k-1}|^2}{f_{k-1}} \Big) \rho\Big( f_{k+1} + \lambda + e^{\alpha_k} \frac{|t_k|^2}{f_k} \Big) H(\alpha_k)\, e^{-\alpha_k}\, d\alpha_k

with 

(2.22)   Z_k = \int \rho\Big( e^{-\alpha} f_k + \lambda + \frac{|t_{k-1}|^2}{f_{k-1}} \Big) \rho\Big( f_{k+1} + \lambda + e^{\alpha} \frac{|t_k|^2}{f_k} \Big) H(\alpha)\, e^{-\alpha}\, d\alpha.

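We now express g_N(i,j) in terms of the variables (t_ℓ, f_ℓ, α_ℓ)_{ℓ=1}^N: 

(2.23)   g_N(i,j) = (-1)^{|i-j|} \Big[ \prod_{\substack{k \equiv 2 \bmod 3 \\ i \le k \le j-1}} e^{\alpha_k} \Big] \Big[ \prod_{k=i}^{j-1} \frac{t_k}{f_k} \Big] \mathcal{H}_j,

where 

(2.24)   \mathcal{H}_j = \frac{1}{\gamma_j - |t_j|^2 G_{j+1}}

is a function of (t_ℓ, f_ℓ, α_ℓ)_{ℓ≥j}. By the conditional independence of (α_k) we find that 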



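(2.25)   \mathbb{E}\left( |g_N(i,j)|^r \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N,\, (\alpha_\ell)_{\ell=j}^N \right)
             = \Big[ \prod_{k=i}^{j-1} \frac{|t_k|^r}{|f_k|^r} \Big] \frac{1}{\left| \gamma_j - |t_j|^2 G_{j+1} \right|^r} \prod_{\substack{k \equiv 2 \bmod 3 \\ i \le k \le j-1}} \mathbb{E}\left( e^{r \alpha_k} \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right).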

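Applying Proposition 3 to each factor E(e^{rα_k} | (t_ℓ, f_ℓ)_{ℓ=1}^N) on the right hand side, we find that 

(2.26)   \mathbb{E}\left( |g_N(i,j)|^r \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N,\, (\alpha_\ell)_{\ell=j}^N \right)
             = \Big[ \prod_{k=i}^{j-1} \frac{|t_k|^r}{|f_k|^r} \Big] \frac{1}{\left| \gamma_j - |t_j|^2 G_{j+1} \right|^r} \prod_{\substack{k \equiv 2 \bmod 3 \\ i \le k \le j-1}} e^{-h_k(r,s)}\, \mathbb{E}\left( e^{s \alpha_k} \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right)^{r/s}

with 

(2.27)   h_k(r,s) = \frac{1}{s} \int_0^s \min(r,q)\, (s - \max(r,q))\, \mathrm{Var}_q\left( \alpha_k \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right) dq,

and 

(2.28)   \mathrm{Var}_q\left( \alpha_k \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right) = \inf_{m \in \mathbb{R}} \frac{\mathbb{E}\left( (\alpha_k - m)^2 e^{q \alpha_k} \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right)}{\mathbb{E}\left( e^{q \alpha_k} \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right)}.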

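Using the conditional independence of (α_k) once again to reassemble g_N(i,j) inside the 
expectation on the r.h.s., we find that 

(2.29)   \mathbb{E}\left( |g_N(i,j)|^r \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N,\, (\alpha_\ell)_{\ell=j}^N \right)
             = e^{-\sum_{k=i}^{j-1} h_k(r,s)}\, \mathbb{E}\left( |g_N(i,j)|^s \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N,\, (\alpha_\ell)_{\ell=j}^N \right)^{r/s},

where we have set h_k(r, s) = 0 for k ≢ 2 mod 3. 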
After averaging and applying the Hölder inequality, we conclude that 

(2.30)   \mathbb{E}\left( |g_N(i,j)|^r \right) \le \mathbb{E}\left( e^{-\frac{s}{s-r} \sum_{k=i}^{j-1} h_k(r,s)} \right)^{\frac{s-r}{s}} \mathbb{E}\left( |g_N(i,j)|^s \right)^{r/s}.

Eq. (2.30) is the key result. The exponent in the first factor is a sum of O(N) non-negative 
terms, each presumably O(1) and positive with positive probability. It will not be so 
surprising to find that the term itself is O(N) with good probability. The rest is estimates. 

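To proceed with the estimates, let us take the a priori distribution of α_k, before coupling 
and conditioning, to be uniform in an interval [−η, η] centered at the origin: 

(2.31)   H(\alpha) = \frac{1}{2\eta}\, I[|\alpha| \le \eta],

with η to be chosen below. Although Var_q(α_k | (t_ℓ, f_ℓ)_{ℓ=1}^N) is defined as a function of (t_ℓ, f_ℓ)_{ℓ=1}^N, 
it is useful to express it in terms of the variables (t_ℓ, γ_ℓ, α_ℓ)_{ℓ=1}^N: 

(2.32)   \mathrm{Var}_q\left( \alpha_k \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right) = \inf_{m} \frac{\int_{-\eta}^{\eta} (\alpha - m)^2 e^{(q-1)\alpha}\, \nu_k(\alpha)\, d\alpha}{\int_{-\eta}^{\eta} e^{(q-1)\alpha}\, \nu_k(\alpha)\, d\alpha},

with 

(2.33)   \nu_k(\alpha) = \rho\Big( e^{\alpha_k - \alpha} \gamma_k + \lambda + \frac{|t_{k-1}|^2}{\gamma_{k-1}} \Big)\, \rho\Big( \gamma_{k+1} + \lambda + e^{\alpha - \alpha_k} \frac{|t_k|^2}{\gamma_k} \Big).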
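A lower bound for Var_q(α_k | (t_ℓ, f_ℓ)_{ℓ=1}^N), sufficient for our purposes, is 

(2.34)   \mathrm{Var}_q\left( \alpha_k \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right) \ge \frac{1}{3}\, \eta^2 e^{-2|q-1|\eta}\, \frac{\inf_{-\eta < \alpha < \eta} \nu_k(\alpha)}{\sup_{-\eta < \alpha < \eta} \nu_k(\alpha)}.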
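The mechanism behind this kind of lower bound is elementary: a probability density on [−η, η] whose weight varies by at most a factor inf/sup has variance at least (inf/sup) times η²/3, the variance of the uniform distribution on the same interval. The sketch below illustrates this numerically; the function names and the sample weight are hypothetical, and the weight stands in for the full tilted density on the interval.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoid rule, written out to avoid version-specific NumPy helpers."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def weighted_variance(weight, eta, n=4001):
    """Variance of the probability measure with density proportional to weight on [-eta, eta]."""
    a = np.linspace(-eta, eta, n)
    w = weight(a)
    mass = _trapezoid(w, a)
    mean = _trapezoid(a * w, a) / mass
    return _trapezoid((a - mean) ** 2 * w, a) / mass

def variance_lower_bound(weight, eta, n=4001):
    """(eta**2 / 3) * inf(weight) / sup(weight), with inf and sup taken over the grid."""
    a = np.linspace(-eta, eta, n)
    w = weight(a)
    return (eta ** 2 / 3.0) * w.min() / w.max()
```

For a constant weight the two quantities coincide at η²/3; for a non-constant weight the variance stays above the bound, which is all the argument needs.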

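The r.h.s. still carries some dependence on α_k, through the density ν_k. We may eliminate 
the dependence on α_k entirely by bounding the right hand side from below: 

(2.35)   \mathrm{Var}_q\left( \alpha_k \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right)
             \ge \frac{1}{3}\, \eta^2 e^{-2|q-1|\eta} \inf_{-2\eta < \alpha, \beta < 2\eta}
                 \frac{\rho\big( e^{-\alpha} \gamma_k + \lambda + \frac{|t_{k-1}|^2}{\gamma_{k-1}} \big)\, \rho\big( \gamma_{k+1} + \lambda + e^{\alpha} \frac{|t_k|^2}{\gamma_k} \big)}
                      {\rho\big( e^{-\beta} \gamma_k + \lambda + \frac{|t_{k-1}|^2}{\gamma_{k-1}} \big)\, \rho\big( \gamma_{k+1} + \lambda + e^{\beta} \frac{|t_k|^2}{\gamma_k} \big)}.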

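It is useful to write 

e^{-\alpha} \gamma_k + \lambda + \frac{|t_{k-1}|^2}{\gamma_{k-1}} = (e^{-\alpha} - 1)\, \gamma_k + v_k,

and similarly for the term in the denominator and the term with index k + 1. Finally, the 
r.h.s. is no larger if we factor the infimum on the right hand side, 

(2.36)   \mathrm{Var}_q\left( \alpha_k \,\middle|\, (t_\ell, f_\ell)_{\ell=1}^N \right)
             \ge \frac{1}{3}\, \eta^2 e^{-2|q-1|\eta}
                 \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\big( v_k + (e^{-\alpha} - 1) \gamma_k \big)}{\rho\big( v_k + (e^{-\beta} - 1) \gamma_k \big)}
                 \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\big( v_{k+1} + (e^{\alpha} - 1) \frac{|t_k|^2}{\gamma_k} \big)}{\rho\big( v_{k+1} + (e^{\beta} - 1) \frac{|t_k|^2}{\gamma_k} \big)}.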

On the r.h.s., the only dependence on q is in the exponential term. In the integral (2.27), 
there is not much loss in replacing this exponential by the (smaller) e~^'*~^'^, so that 

(2.37) ^hk{r,s) > !:^r^\-'\^~'\nUk{v), 

s — r D 
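with 

(2.38)   U_k(\eta) = \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\big( v_k + (e^{-\alpha} - 1) \gamma_k \big)}{\rho\big( v_k + (e^{-\beta} - 1) \gamma_k \big)}
                     \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\big( v_{k+1} + (e^{\alpha} - 1) \frac{|t_k|^2}{\gamma_k} \big)}{\rho\big( v_{k+1} + (e^{\beta} - 1) \frac{|t_k|^2}{\gamma_k} \big)}.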

Plugging this estimate into eq. (2.30), we obtain 

(2.39) E{\gr,{z,m < E (e"^''^-"^'^"' ^.C^)) E (|^^.(^, j) 
Since ρ is fluctuation regular, there are δ, ε > 0 and a set Ω ⊂ ℝ with 

\int_\Omega \rho\, dx = q_0 > 0

such that U_k(η) ≥ δ² I[A_k], where I[A_k] is the indicator function of the event: 

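(2.40)   A_k = \Big\{ v_k, v_{k+1} \in \Omega, \quad |\gamma_k| < \frac{\varepsilon}{e^{2\eta} - 1}, \quad\text{and}\quad \frac{|t_k|^2}{|\gamma_k|} < \frac{\varepsilon}{e^{2\eta} - 1} \Big\}.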

In turn, since γ_k = v_k − λ − |t_{k−1}|²/γ_{k−1} and |λ| ≤ Λ (by assumption), we see that 

(2.41) AkD{vkeQ , \vk\ <L}n K+i eQ}n {\tk-i\, \tk\ < r} 



with r and L any positive numbers. 

We estimate the probability of A_k from below by integrating eq. (2.41) over v_{k+1}, v_k, v_{k−1}, 
t_k, and t_{k−1} in that order. (The need to integrate over three consecutive v variables is the 
reason we introduced α_k only for k ≡ 2 mod 3.) To begin, 

(2.42)   \mathrm{Prob}\left( v_{k+1} \in \Omega \,\middle|\, (v_\ell)_{\ell \le k},\, (t_\ell) \right) = \int_\Omega \rho(v)\, dv = q_0.

Looking now at v_k, since γ_k = v_k − λ − |t_{k−1}|²/γ_{k−1}, we see that 

(2.43) K e , \vk\ < L}r]\-^ < ^ 



|7fc| - r^e^'^-l 

r e^'' — 1 e^^ — 1 

{vk e n} n {\vk\ < L} n <Vk ^ [a - — - — , a + — - — ; 



with a = λ + |t_{k−1}|²/γ_{k−1}. Since the density ρ is bounded, it follows that 



(2.44) Prob [ Vk e Q , \vk\ < L , < 



|7fc| - r2e2''-l 

Similarly 



2'q 

>9„-Prob(|i.j| >L)-2||p||^T•^— 



(2.45) P.ob|^<l(-5^-L-A 



{Vi)i^k-l,k,k+l, {tl) 



>l-2||p|Lr2 
Combining these estimates with eq. (2.41) and integrating over the identically distributed 
variables t_k and t_{k−1}, we find 



(2.46) 



Fioh{Ak\{vi)i^k-i,k,k+u {UWk,k~i) > go ( go - Prob(|t;fc| > L) - 2 \\p\\^r'^ 



X 1 - 2poor^ 



e2'7-l 



L-A 



Prob(|4| <r)2. 



The key thing to observe is that the r.h.s. of eq. (2.46) is independent of k and can be made 
arbitrarily close to q_0² by suitable choice of large L, τ and small η. 

So, for sufficiently small η we have Prob(A_k | (v_ℓ)_{ℓ≠k−1,k,k+1}, (t_ℓ)_{ℓ≠k,k−1}) ≥ ½ q_0², say. Since 
U_k(η) ≥ δ² I[A_k], we find that 



s — r 



2s 



,2 / T ^_52^2rsg-2|.-l|^ 



? — 1 (see Lemma lA.ll below). Combined 

□ 



(2.47) E(e-T'''--'^-^"'EilJ^.Wj ^ < exp 

by integrating successively over Vk, tk from k = i, . . . 
with (12.391) this completes the proof of Lemma 12.21 

3. Band matrices 

To translate the argument of the previous section to the context of band matrices, we 
replace each of the variables v_j and t_j by W × W matrices. Given W ∈ ℕ, consider a sequence, 
V_n, n = 1, …, of independent identically distributed hermitian W × W matrices together 
with a sequence, T_n, n = 1, …, of independent identically distributed W × W matrices (not 
necessarily hermitian). With these matrix variables, we form an infinite random hermitian 
band matrix 

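(3.1)   X_W = \begin{pmatrix}
V_1         & T_1         &        &        \\
T_1^\dagger & V_2         & T_2    &        \\
            & T_2^\dagger & V_3    & \ddots \\
            &             & \ddots & \ddots
\end{pmatrix},

a random operator on ℓ²(ℕ), and for each N the random matrix 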
(3.2)   X_{W;N} = Q_N X_W Q_N

with Q_N the projection onto ℓ²({1, …, N}). For simplicity, let us consider only N a multiple 
of W: N = nW. Thus, 



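(3.3)   X_{W;nW} = \begin{pmatrix}
V_1         & T_1    &                 &         \\
T_1^\dagger & V_2    & T_2             &         \\
            & \ddots & \ddots          & \ddots  \\
            &        & T_{n-1}^\dagger & V_n
\end{pmatrix}.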
Let P_j denote the projection onto the j-th block, ℓ²({(j − 1)W + 1, …, jW}), so 

V_j = P_j X_W P_j and T_j = P_j X_W P_{j+1}. 

Band matrix ensembles such as the Gaussian band ensemble (1.1) are of this form, with 
T_j lower triangular matrices. However, for the argument presented below it is not necessary 
that T_j be lower triangular. (Also, neither strict independence nor identicality of distribution 
are needed. Nonetheless, to keep things simple, let us stick to the i.i.d. case.) 
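The block structure can be made concrete with a small numerical example. The sketch below is an illustration under stated assumptions: it builds an unnormalized Hermitian matrix with Gaussian entries that vanish for |i − j| ≥ W (the normalization of the actual ensemble is not reproduced here), and the function names are hypothetical. It then cuts the matrix into W × W blocks V_j (diagonal) and T_j (first block superdiagonal).

```python
import numpy as np

def gaussian_band_matrix(n_blocks, width, rng):
    """Hermitian matrix of size n_blocks * width with entries vanishing for |i - j| >= width."""
    size = n_blocks * width
    a = rng.normal(size=(size, size)) + 1j * rng.normal(size=(size, size))
    upper = np.triu(a, k=1)
    x = upper + upper.conj().T + np.diag(rng.normal(size=size))
    i, j = np.indices((size, size))
    x[np.abs(i - j) >= width] = 0.0  # entries vanish outside the band
    return x

def blocks(x, width):
    """Diagonal blocks V_j and superdiagonal blocks T_j of a block-tridiagonal matrix."""
    n = x.shape[0] // width
    v = [x[k * width:(k + 1) * width, k * width:(k + 1) * width] for k in range(n)]
    t = [x[k * width:(k + 1) * width, (k + 1) * width:(k + 2) * width] for k in range(n - 1)]
    return v, t
```

With this convention an entry of T_j at local position (a, b) has global index distance W + b − a, so it survives only when b < a: each T_j comes out lower triangular with zero diagonal, and all blocks beyond the first superdiagonal vanish.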

In adapting the arguments from the scalar case to the matrix variables $V_j, T_j$, we must
account for the non-commutativity of the matrix product. The basis of the argument is a
change of variables $\Gamma_j \mapsto e^{\alpha_j}\Gamma_j$ with $\alpha_j$ a scalar random variable and $\Gamma_j$ a $W \times W$ matrix
obtained from the resolvent of $X_{W;jW}$. In the end we will need to estimate the ratio

$$\frac{\rho(V_j + (e^{-\alpha} - 1)\Gamma_j)}{\rho(V_j + (e^{-\beta} - 1)\Gamma_j)}$$

for small $\alpha, \beta$, where $\rho$ is the density of the distribution of $V_j$ (assumed to be absolutely
continuous with respect to Lebesgue measure on some vector space of matrices). In the
scalar case, this change of variables was useful for all fluctuation regular densities. In the
matrix case, an additional complication arises. Unless $\Gamma_j$ falls in the vector space supporting
the distribution of $V_j$ there will be constraints on the matrix elements of $\Gamma_j$ which manifest
themselves as $\delta$ functions after the change of variables. However, $\Gamma_j$ is formed from $\{V_k\}$ and
$\{T_k\}$ via non-linear operations, so there is no reason to expect it to fall in this vector space.
(For example when $V_j$ are diagonal, $\Gamma_j$ will in general have off-diagonal components.) To
guarantee closure under non-linear operations we suppose that the vector space supporting
the distribution of $V_j$ is a matrix algebra:

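Definition 2. A $*$-algebra over $\mathbb{R}$ of $W \times W$ matrices is a set $\mathcal{A}$ of $W \times W$ matrices that
is a vector space over $\mathbb{R}$, under the usual addition and scalar multiplication, and such that

$$V_1, V_2 \in \mathcal{A} \implies V_1 V_2 \in \mathcal{A} \text{ and } V_1^\dagger \in \mathcal{A}.$$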

We will use 

Proposition 5. If $\mathcal{A}$ is a matrix $*$-algebra over $\mathbb{R}$ and $V \in \mathcal{A}$ is invertible then $V^{-1} \in \mathcal{A}$.

Proof. This is a standard result for $C^*$ algebras. In that context, the algebra is usually
assumed to be a vector space over $\mathbb{C}$, but that is not necessary. Here is the proof. If $V \in \mathcal{A}$
is self-adjoint and invertible, by the Weierstrass theorem we can approximate $V^{-1}$ (in the
operator norm, say) by polynomials in $V$ with real coefficients. That is, we can approximate
$V^{-1}$ by elements of $\mathcal{A}$. Since a finite dimensional vector space is complete, $V^{-1} \in \mathcal{A}$. For
general invertible $V$ we have $V^{-1} = (V^\dagger V)^{-1} V^\dagger \in \mathcal{A}$, since $V^\dagger V \in \mathcal{A}$ is self-adjoint. □
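As a small numerical illustration of Proposition 5, the sketch below uses the real circulant matrices, which form a $*$-algebra over $\mathbb{R}$. This example algebra and the helper names `circulant` and `is_circulant` are assumptions made for illustration; they do not appear in the text. The code checks that the inverse of an invertible circulant matrix is again circulant, and that the reduction $V^{-1} = (V^\dagger V)^{-1} V^\dagger$ used in the proof holds numerically.

```python
import numpy as np

def circulant(c):
    """Circulant matrix whose first column is c (illustrative helper)."""
    n = len(c)
    return np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])

def is_circulant(m, tol=1e-10):
    """True if m equals the circulant matrix built from its own first column."""
    return np.allclose(m, circulant(m[:, 0]), atol=tol)

# Strictly diagonally dominant (5 > 1 + 2 + 0.5), hence invertible; not symmetric.
V = circulant(np.array([5.0, 1.0, -2.0, 0.5, 0.0]))
V_inv = np.linalg.inv(V)

# Closure of the algebra under inversion, as Proposition 5 asserts.
inverse_stays_in_algebra = is_circulant(V_inv)

# Reduction to the self-adjoint case: V^{-1} = (V^T V)^{-1} V^T for real V.
reduction_holds = np.allclose(V_inv, np.linalg.inv(V.T @ V) @ V.T)

print(inverse_stays_in_algebra, reduction_holds)
```

The same check could be run for any other finite dimensional $*$-algebra by replacing the membership test.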

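Assumption 1. Let $S$ be an increasing sequence of integers and fix, for each $W \in S$, a
$*$-algebra over $\mathbb{R}$ of $W \times W$ matrices $\mathcal{A}_W$, and the set $\mathcal{T}_W$ of matrices which preserve $\mathcal{A}_W$
under conjugations

(3.4)  $$\mathcal{T}_W = \{T : T^\dagger \mathcal{A}_W T \subset \mathcal{A}_W\}.$$

Let $\mathcal{A}_W^{\mathrm{h}} = \{V \in \mathcal{A}_W : V = V^\dagger\}$, the set of hermitian elements of $\mathcal{A}_W$. We require that
$T_j \in \mathcal{T}_W$ and $V_j \in \mathcal{A}_W^{\mathrm{h}}$, $j = 1, 2, \ldots$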

Remark. Note that $\mathcal{T}_W$ is closed under conjugation: $T \in \mathcal{T}_W \implies T^\dagger \in \mathcal{T}_W$.

There is a good deal of flexibility in the choice of algebras. Of course, we may take
$\mathcal{A}_W = \mathcal{T}_W =$ all $W \times W$ complex matrices, so $X_{W;N}$ is complex Hermitian. On the other
hand, we could restrict $\mathcal{A}_W$ to be the set of matrices with real entries, so $X_{W;N}$ is real
symmetric. In this case $\mathcal{A}_W$ is not a complex vector space. Similarly we could take $\mathcal{A}_W$ to
be the set of matrices with quaternion entries, where the quaternion units are represented
by $2 \times 2$ matrices, so $X_{W;N}$ would be Hermitian but anti-symmetric under transposition,
$X_{W;N}^{\mathrm{T}} = -X_{W;N}$. In this last case, $S$ would be the set of even integers.

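An important consequence of assuming that $T_j \in \mathcal{T}_W$ and $V_j \in \mathcal{A}_W^{\mathrm{h}}$ (the hermitian elements
of $\mathcal{A}_W$) is that we have some a priori information on the block matrices making up the
resolvent of $X_{W;nW}$.

Lemma 3.1. Suppose $Y$ is an $nW \times nW$ matrix that is block tri-diagonal,

$$P_i Y P_j = 0 \quad\text{if } |i - j| \ge 2,$$

and satisfies

$$P_j Y P_j \in \mathcal{A}_W, \quad j = 1, \ldots, n,$$

and

$$P_j Y P_{j+1} = (P_{j+1} Y P_j)^\dagger \in \mathcal{T}_W, \quad j = 1, \ldots, n - 1.$$

If $Y$ is invertible then

(3.5)  $$P_j Y^{-1} P_j \in \mathcal{A}_W, \quad j = 1, \ldots, n,$$

and

(3.6)  $$P_i Y^{-1} P_j \in \mathcal{AT}_W, \quad i, j = 1, \ldots, n,$$

where $\mathcal{AT}_W$ is the algebra generated by $\mathcal{A}_W$ and $\mathcal{T}_W$.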

Remark. The off-diagonal blocks of $Y^{-1}$ need not be in $\mathcal{A}_W$. This is apparent already for
$n = 2$, where, by the Schur complement formula,

$$P_1 Y^{-1} P_2 = -(V_1 - T_1 V_2^{-1} T_1^\dagger)^{-1}\, T_1\, V_2^{-1}.$$

In the expression on the right, the first and last factors are in $\mathcal{A}_W$ but the middle factor,
$T_1$, need not be.
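The $n = 2$ Schur complement formula in the remark can be checked numerically. The following is a minimal sketch with small hand-picked real blocks (the matrices `V1`, `V2`, `T1` are arbitrary illustrative choices, not taken from the text); for real matrices the adjoint is the transpose.

```python
import numpy as np

W = 2
V1 = np.array([[4.0, 1.0], [1.0, 3.0]])    # hermitian diagonal block
V2 = np.array([[5.0, -1.0], [-1.0, 4.0]])  # hermitian diagonal block
T1 = np.array([[1.0, 2.0], [0.0, 1.0]])    # off-diagonal block, not hermitian

# Y is the 2W x 2W block matrix [[V1, T1], [T1^T, V2]].
Y = np.block([[V1, T1], [T1.T, V2]])
Y_inv = np.linalg.inv(Y)

# Schur complement of V2 in Y.
S = V1 - T1 @ np.linalg.inv(V2) @ T1.T

diag_block = np.linalg.inv(S)               # expected P_1 Y^{-1} P_1
off_block = -np.linalg.inv(S) @ T1 @ np.linalg.inv(V2)  # expected P_1 Y^{-1} P_2

diag_ok = np.allclose(Y_inv[:W, :W], diag_block)
off_ok = np.allclose(Y_inv[:W, W:], off_block)
print(diag_ok, off_ok)
```

The off-diagonal block is a product in which $T_1$ is sandwiched between elements of the algebra, which is why it lands in the generated algebra rather than in $\mathcal{A}_W$ itself.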

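Proof. The proof is by induction on $n$. The result is clear for $n = 1$. So, suppose we know
that it holds if $Y$ is a tridiagonal block matrix of size no larger than $(n-1)W \times (n-1)W$.
Write $V_j = P_j Y P_j$ and $T_j = P_j Y P_{j+1}$. First consider (3.5). By the Schur complement formula,

$$P_j Y^{-1} P_j = \left( V_j - T_{j-1}^\dagger\, P_{j-1} Y_-^{-1} P_{j-1}\, T_{j-1} - T_j\, P_{j+1} Y_+^{-1} P_{j+1}\, T_j^\dagger \right)^{-1},$$

where

$$Y_- = \begin{pmatrix} V_1 & T_1 & & \\ T_1^\dagger & \ddots & \ddots & \\ & \ddots & \ddots & T_{j-2} \\ & & T_{j-2}^\dagger & V_{j-1} \end{pmatrix}, \qquad Y_+ = \begin{pmatrix} V_{j+1} & T_{j+1} & & \\ T_{j+1}^\dagger & \ddots & \ddots & \\ & \ddots & \ddots & T_{n-1} \\ & & T_{n-1}^\dagger & V_n \end{pmatrix}.$$

As $Y_+$ and $Y_-$ are of size no larger than $(n-1)W \times (n-1)W$ and $T_{j-1}, T_j \in \mathcal{T}_W$, it follows
from the induction hypothesis that the matrix in parentheses on the r.h.s. is in $\mathcal{A}_W$.
By Prop. 5, $P_j Y^{-1} P_j \in \mathcal{A}_W$.

Now consider (3.6). Suppose $i < j$ (the other case is similar), and let $Y_-$ be as above, the
restriction of $Y$ to the first $j - 1$ blocks. By the resolvent identity, one has

$$P_i Y^{-1} P_j = -P_i Y_-^{-1} P_{j-1}\, T_{j-1}\, P_j Y^{-1} P_j.$$

But $P_i Y_-^{-1} P_{j-1} \in \mathcal{AT}_W$ by the induction hypothesis and $P_j Y^{-1} P_j \in \mathcal{A}_W$ as we have just
shown. It follows that the r.h.s. is in $\mathcal{AT}_W$. □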

We now consider the properties required of the distribution of $V_j$, denoted $\mathbb{P}_W$. Let $\|\cdot\|$
denote the operator norm of a matrix,

(3.7)  $$\|A\| = \sup_{\|v\| = 1} \|Av\|,$$

and let $\sigma(A)$ denote the set of eigenvalues of a matrix. Recall, if $A$ is self-adjoint, that

$$\|A\| = \max |\sigma(A)| \quad\text{and}\quad \|A^{-1}\| = \frac{1}{\min |\sigma(A)|}.$$
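The two identities recalled above are easy to confirm numerically. The sketch below uses an arbitrary small symmetric matrix chosen for illustration; `np.linalg.norm(A, 2)` returns the operator (spectral) norm.

```python
import numpy as np

# An arbitrary invertible real symmetric matrix (determinant is -9).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, -3.0, 1.0],
              [0.0, 1.0, 1.0]])

eigenvalues = np.linalg.eigvalsh(A)          # spectrum of a self-adjoint matrix
op_norm = np.linalg.norm(A, 2)               # operator norm of A
inv_norm = np.linalg.norm(np.linalg.inv(A), 2)

# ||A|| = max |sigma(A)| and ||A^{-1}|| = 1 / min |sigma(A)|.
norm_identity = np.isclose(op_norm, np.max(np.abs(eigenvalues)))
inverse_identity = np.isclose(inv_norm, 1.0 / np.min(np.abs(eigenvalues)))
print(norm_identity, inverse_identity)
```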

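Assumption 2. Let $(\mathbb{P}_W)_{W \in S}$ be a family of probability measures such that

• (Absolute continuity): Each measure $\mathbb{P}_W$ is supported on $\mathcal{A}_W^{\mathrm{h}}$, the hermitian elements
of $\mathcal{A}_W$, and absolutely continuous with respect to Lebesgue measure on that space. Let
$\rho_W(V)$ denote the density of $\mathbb{P}_W$ with respect to Lebesgue measure.

• (Wegner-type estimates): There are $\kappa > 0$ and $\sigma \ge 0$ such that for all $A \in \mathcal{A}_W^{\mathrm{h}}$,
$W \in S$,

(3.8)  $$\mathbb{P}_W\{V : \|(V - A)^{-1}\| > tW^{1+\sigma}\} \le \frac{\kappa}{t};$$

and for all $A, B \in \mathcal{A}_W^{\mathrm{h}}$ and $C \in \mathcal{AT}_W$, $W \in S$,

(3.9)  $$\mathbb{P}_W \times \mathbb{P}_W\left\{ (V_1, V_2) : \left\| \begin{pmatrix} V_1 - A & C \\ C^\dagger & V_2 - B \end{pmatrix}^{-1} \right\| > tW^{1+\sigma} \right\} \le \frac{2\kappa}{t}.$$

• (Fluctuation regularity with bounded tails): There are constants $p_0, \delta, \varepsilon > 0$ and $L, a, \zeta \ge 0$
such that, for each $W \in S$, there is $\Omega_W \subset \mathcal{A}_W^{\mathrm{h}}$ with $\mathbb{P}_W(\Omega_W) \ge p_0$ and if $V \in \Omega_W$,
then

(3.10)  $$\|V\| \le LW^a$$

and

(3.11)  $$\frac{\rho_W(V_1)}{\rho_W(V_2)} \ge \delta \quad\text{for all } V_1, V_2 \in \mathcal{A}_W^{\mathrm{h}} \text{ with } \|V_j - V\| \le \varepsilon W^{-\zeta},\ j = 1, 2.$$

Remarks.
(1) Since

$$\|(V - \lambda I)^{-1}\| = \frac{1}{\mathrm{dist}(\lambda, \sigma(V))},$$

the Wegner-type estimate (3.8) implies

(3.12)  $$\mathbb{P}_W\left\{V : \mathrm{dist}(\lambda, \sigma(V)) < \frac{\epsilon}{W^{1+\sigma}}\right\} \le \kappa\epsilon.$$

If $V$ is suitably scaled so as to have mean eigenvalue spacing of order $1/W$, this
suggests that we should be able to take $\sigma = 0$. That has not been proved, however, for
the random matrix ensembles studied here. For Wigner type matrices, in particular
for the Gaussian band ensemble (1.1), we will obtain the estimates (3.8), (3.9) with
$\sigma = \frac{1}{2}$ in §5.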

(2) The parameters $a$ and $\sigma$ are not independent. If we rescale via $V \mapsto W^\gamma V$ this
results in a shift $\sigma \mapsto \sigma - \gamma$ and $a \mapsto a + \gamma$. Nonetheless it is convenient to keep both
parameters since the natural scaling of $V$ is to choose the eigenvalue spacing to be
of order $1/W$. This typically leads to $a = 0$, but if the entries of $V$ have heavy tails
then one may have $a > 0$.

We require very little from the distribution of $T_j$, denoted $\mathbb{Q}_W$, essentially just a uniform
(in $W$) bound on the tails:

Assumption 3. Let $(\mathbb{Q}_W)_{W \in S}$ be a family of probability measures, with $\mathbb{Q}_W$ supported on
$\mathcal{T}_W$. Suppose that there are $q_0, \tau > 0$ and $b \ge 0$ such that

(3.13)  $$\mathbb{Q}_W\{T : \|T\| \le \tau W^b\} \ge q_0.$$

Remark. $\mathbb{Q}_W$ could be supported on a single point, in which case $T_j$ would be a constant
sequence. For instance, we could take $T_j = I$.

Lemma W for $X_{W;nW}$ follows easily from part (2) of assumption 2, the Wegner-type estimates.

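Lemma 3.2 (Lemma W for $X_{W;nW}$). Let $V_j$, $T_j$, $j = 1, \ldots$, be mutually independent
sequences of independent random $W \times W$ matrices. Suppose each $V_j$ has distribution $\mathbb{P}_W$ and
each $T_j$ has distribution $\mathbb{Q}_W$. Then, for each $\lambda \in \mathbb{R}$,

$$\mathrm{Prob}[\lambda \text{ is an eigenvalue of } X_{W;nW}] = 0$$

and

(3.14)  $$\mathrm{Prob}\left( \|P_i (X_{W;nW} - \lambda I)^{-1} P_j\| > tW^{1+\sigma} \,\Big|\, (T_k)_{k=1}^{n-1} \text{ and } (V_k)_{k \ne i,j} \right) \le \frac{2\kappa}{t}$$

for any $1 \le i, j \le n$.

Proof. Let us first consider the case $i = j$. The Schur complement formula shows that

$$P_i (X_{W;nW} - \lambda I)^{-1} P_i = (V_i - \lambda I - K)^{-1}$$

with

$$K = T_{i-1}^\dagger (X_- - \lambda I)^{-1} T_{i-1} + T_i (X_+ - \lambda I)^{-1} T_i^\dagger,$$

with $X_-$ and $X_+$ the restrictions of $X$ to the blocks above and below $i$. By Lemma 3.1,
$K \in \mathcal{A}_W$. (Note that it is self-adjoint.) It follows from (3.8) that $\lambda$ is an eigenvalue of
$X_{W;nW}$ with probability $0$ and that (3.14) holds for $i = j$.

The argument for $i \ne j$ is similar. In this case, we first estimate the norm of
$(P_i + P_j)(X_{W;nW} - \lambda I)^{-1}(P_i + P_j)$. As above, we have

$$(P_i + P_j)(X_{W;nW} - \lambda I)^{-1}(P_i + P_j) = \left[ \begin{pmatrix} V_i - \lambda I & 0 \\ 0 & V_j - \lambda I \end{pmatrix} + \begin{pmatrix} A & C \\ C^\dagger & B \end{pmatrix} \right]^{-1},$$

where $A$, $B$ and $C$ are formed from blocks of the resolvents of restrictions of $X_{W;nW}$. One
may verify that $A, B \in \mathcal{A}_W^{\mathrm{h}}$ and $C \in \mathcal{AT}_W$. Thus the result follows from (3.9). □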
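The diagonal-block Schur complement identity used in the proof can be verified on a small example. This is a sketch under stated assumptions: the blocks are real, randomly generated and diagonally dominant (so that every matrix that gets inverted is invertible), and the helper names `assemble` and `block` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
W, n, lam = 2, 4, 0.7

# Diagonal blocks near 10*I and off-diagonal entries in [-1, 1] give strict
# diagonal dominance, so all the inverses below exist.
V = [10 * np.eye(W) + 0.5 * rng.uniform(-1, 1, (W, W)) for _ in range(n)]
V = [(v + v.T) / 2 for v in V]
T = [rng.uniform(-1, 1, (W, W)) for _ in range(n - 1)]

def assemble(Vs, Ts):
    """Block tridiagonal matrix with diagonal blocks Vs and upper blocks Ts."""
    m = len(Vs)
    X = np.zeros((m * W, m * W))
    for k in range(m):
        X[k*W:(k+1)*W, k*W:(k+1)*W] = Vs[k]
    for k in range(m - 1):
        X[k*W:(k+1)*W, (k+1)*W:(k+2)*W] = Ts[k]
        X[(k+1)*W:(k+2)*W, k*W:(k+1)*W] = Ts[k].T
    return X

def block(M, i, j):
    """W x W block (i, j) of M, 0-based."""
    return M[i*W:(i+1)*W, j*W:(j+1)*W]

G = np.linalg.inv(assemble(V, T) - lam * np.eye(n * W))

i = 1  # 0-based index of the block; X_minus and X_plus are both non-empty
G_minus = np.linalg.inv(assemble(V[:i], T[:i-1]) - lam * np.eye(i * W))
G_plus = np.linalg.inv(assemble(V[i+1:], T[i+1:]) - lam * np.eye((n - i - 1) * W))

# K couples block i to the last block of X_minus and the first block of X_plus.
K = (T[i-1].T @ block(G_minus, i - 1, i - 1) @ T[i-1]
     + T[i] @ block(G_plus, 0, 0) @ T[i].T)

schur_ok = np.allclose(block(G, i, i), np.linalg.inv(V[i] - lam * np.eye(W) - K))
K_is_symmetric = np.allclose(K, K.T)
print(schur_ok, K_is_symmetric)
```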

It follows that 

(3.15) E {\\P^{Xw;nW - )^ir'Pj\n < ^W'-'^-^^ 
and so 

(3.16) E(|(v,P,(XH.;niy- A/)-ip,w)|^) < 

for any two vectors v, w. (See (11.81) .) 
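For orientation, here is a sketch of how a fractional moment bound follows from a weak-type tail bound of the form appearing in (3.14). It is the standard layer-cake computation, written for a generic non-negative random variable $X$; the constant obtained this way is an assumption of the sketch and need not coincide with the one displayed in the text.

```latex
% Assume: Prob( X > t W^{1+\sigma} ) \le 2\kappa / t for all t > 0, and 0 < s < 1.
% Put c = 2\kappa W^{1+\sigma}, so that Prob(X > x) \le \min(1, c/x).
\begin{align*}
\mathbb{E}(X^s)
  &= \int_0^\infty \mathrm{Prob}\bigl(X > u^{1/s}\bigr)\, du
   \;\le\; \int_0^\infty \min\bigl(1,\, c\, u^{-1/s}\bigr)\, du \\
  &= c^s + c \int_{c^s}^\infty u^{-1/s}\, du
   \;=\; c^s + \frac{s}{1-s}\, c^s
   \;=\; \frac{c^s}{1-s}
   \;=\; \frac{(2\kappa)^s}{1-s}\, W^{s(1+\sigma)} .
\end{align*}
```

The power $W^{s(1+\sigma)}$ obtained here matches the power of $W$ that appears later in (3.21).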
Lemma F in this context is as follows: 

Lemma 3.3 (Lemma F for $X_{W;nW}$). Let $V_j$, $T_j$, $j = 1, \ldots$, be mutually independent sequences
of independent random $W \times W$ matrices. Suppose each $V_j$ has distribution $\mathbb{P}_W$ and each $T_j$
has distribution $\mathbb{Q}_W$. Let $D_W$ denote the real dimension of $\mathcal{A}_W^{\mathrm{h}}$. Fix a positive number $\nu$
large enough that $\sup_{W \in S} D_W W^{-\nu} < \infty$ and suppose also that $\nu \ge \zeta + \max(a, 1 + \sigma + 2b)$,
with $a, \sigma, \zeta$ as in assumption 2 and $b$ as in assumption 3. Then for each $0 < r < s < 1$ and
$0 < \Lambda < \infty$ there is $C_{r,s} > 0$ such that if $|i - j| \ge 3$ then

(3.17)  $$\mathbb{E}\left( [\Phi(P_i (X_{W;nW} - \lambda I)^{-1} P_j)]^r \right) \le \exp\left(-C_{r,s} W^{-2\nu} |i - j|\right) \mathbb{E}\left( [\Phi(P_i (X_{W;nW} - \lambda I)^{-1} P_j)]^s \right)^{r/s}$$

for any $\lambda \in [-\Lambda, \Lambda]$ and any non-negative, positive-homogeneous function $\Phi : \mathcal{AT}_W \to \mathbb{R}$,
i.e., $\Phi(Y) \ge 0$ and $\Phi(aY) = a\Phi(Y)$ for $a > 0$.

Remarks.

(1) Below we will apply the result with $\Phi(Y)$ a semi-norm such as the absolute value
of a matrix element $|\langle v, Yw \rangle|$ or the norm $\|Y\|$. However the proof does not make use
of the triangle inequality, so the result also applies, for example, to $\Phi(Y) =$ spectral
radius of $Y$ or $\Phi(Y) =$ smallest singular value of $Y$.

(2) Under rescaling of the matrix elements $X_{W;nW} \mapsto W^\gamma X_{W;nW}$ the localization length
$1/(C_{r,s} W^{-2\nu})$ should not change. That this is indeed so follows since $\zeta \mapsto \zeta - \gamma$,
$a \mapsto a + \gamma$, $\sigma \mapsto \sigma - \gamma$ and $b \mapsto b + \gamma$, so the combination $\zeta + \max(a, 1 + \sigma + 2b)$ is
invariant under rescaling.

Combining Lemma 3.3 and (3.16) we have

Theorem 6. Let $V_j$, $T_j$, $j = 1, \ldots$, be mutually independent sequences of independent
random $W \times W$ matrices. Suppose each $V_j$ has distribution $\mathbb{P}_W$ and each $T_j$ has distribution
$\mathbb{Q}_W$. Let $D_W$ denote the real dimension of $\mathcal{A}_W^{\mathrm{h}}$. Fix a positive number $\nu$ large enough that
$\sup_{W \in S} D_W W^{-\nu} < \infty$ and suppose also that $\nu \ge \zeta + \max(a, 1 + \sigma + 2b)$, with $\sigma, a, \zeta$ as in
assumption 2 and $b$ as in assumption 3. For $0 < t < 1$ let

(3.18)  $$M(W, t) = \max_{1 \le x, y \le nW} \mathbb{E}\left( |\langle e_x, (X_{W;N} - \lambda)^{-1} e_y \rangle|^t \right),$$

where $e_x$ and $e_y$ denote elementary basis vectors. Then

(3.19) M(Ty,t) < 

and given $0 < s < t$ there are constants $C, \mu$ such that for any $1 \le x, y \le nW$

(3.20)  $$\mathbb{E}\left( |\langle e_x, (X_{W;N} - \lambda)^{-1} e_y \rangle|^s \right) \le C\, M(W, t)^{s/t}\, e^{-\mu W^{-2\nu-1} |x - y|}.$$

Proof. This amounts to special cases of (3.16) and Lemma 3.3. The exponent $2\nu + 1$ appears
in (3.20) because the difference $|i - j|$ of the blocks to which $x$ and $y$ belong is estimated
by $|x - y|/W$. The constant $C$ compensates for the exponential factor $e^{-\mu W^{-2\nu-1}|x-y|}$ when
$|x - y|$ is smaller than $3W$, in which case the estimate of Lemma 3.3 does not hold. □

Remark. Putting (3.20) and (3.19) together we have

(3.21)  $$\mathbb{E}\left( |\langle e_x, (X_{W;N} - \lambda)^{-1} e_y \rangle|^s \right) \le \mathrm{const.}\, W^{(1+\sigma)s}\, e^{-\mu W^{-2\nu-1} |x - y|}.$$

If the diagonal blocks $V_j$ are Wigner matrices, as in assumption 4 in §5 below, one may
obtain the estimate

(3.22) M{W,t) <^^^W'^, 

1 t 

resulting in a very slight improvement on the estimate on the r.h.s. of (3.21):

(3.23)  $$\mathbb{E}\left( |\langle e_x, (X_{W;N} - \lambda)^{-1} e_y \rangle|^s \right) \le \mathrm{const.}\, W^{s/2}\, e^{-\mu W^{-2\nu-1} |x - y|}.$$

This improvement is not very significant, as the main point here is the exponential factor,
which dominates any power of $W$ as long as $|x - y| \gg W^{2\nu+1}$.

4. Fluctuations 

We now prove Lemma 3.3. Following the proof of Lemma 2.2, let us fix $\lambda$ and set

$$G_n(i, j) = P_i (X_{W;nW} - \lambda I)^{-1} P_j.$$

Since $G_n(i, j)^\dagger = G_n(j, i)$, in estimating $\|G_n(i, j)\|$ we may assume without loss that $i < j$.
We have, by the resolvent identity,

(4.1)  $$G_n(i, j) = -G_{j-1}(i, j-1)\, T_{j-1}\, G_n(j, j).$$

Iteration gives

(4.2)  $$G_n(i, j) = (-1)^{j-i}\, G_i(i, i)\, T_i\, G_{i+1}(i+1, i+1)\, T_{i+1} \cdots G_{j-1}(j-1, j-1)\, T_{j-1}\, G_n(j, j).$$

Let us define $W \times W$ random matrices

(4.3)  $$\Gamma_k = G_k(k, k)^{-1},$$

related by a recursion relation

(4.4)  $$\Gamma_k = V_k - \lambda I - T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}.$$

As in the $W = 2$ case, these identities may be established using the Schur-complement
formula; compare with (2.9) and (2.13). Similarly,

(4.5)  $$G_n(j, j)^{-1} = V_j - \lambda I - T_{j-1}^\dagger \Gamma_{j-1}^{-1} T_{j-1} - T_j \widetilde{G}_{j+1} T_j^\dagger = \Gamma_j - T_j \widetilde{G}_{j+1} T_j^\dagger,$$

where $\widetilde{G}_{j+1} = P_{j+1} (\widetilde{X}_{W;nW} - \lambda)^{-1} P_{j+1}$ with $\widetilde{X}_{W;nW}$ the matrix obtained from $X_{W;nW}$ by
setting $T_j = 0$. Thus $\widetilde{G}_{j+1}$ is a function of the matrix variables $(V_k)_{k=j+1}^{n}$ and $(T_k)_{k=j+1}^{n-1}$.
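The recursion (4.4) and the product formula (4.2) can both be checked numerically on a small block tridiagonal matrix. The sketch below assumes real, diagonally dominant random blocks (so every truncation is invertible); the helper names `X` and `G` are illustrative and use 1-based block indices to match the text.

```python
import numpy as np

rng = np.random.default_rng(2)
W, n, lam = 2, 5, 0.3

# Diagonally dominant blocks keep every truncated matrix invertible.
V = [10 * np.eye(W) + 0.5 * rng.uniform(-1, 1, (W, W)) for _ in range(n)]
V = [(v + v.T) / 2 for v in V]
T = [rng.uniform(-1, 1, (W, W)) for _ in range(n - 1)]

def X(m):
    """Truncation to the first m blocks: diagonal V_1..V_m, couplings T_1..T_{m-1}."""
    M = np.zeros((m * W, m * W))
    for k in range(m):
        M[k*W:(k+1)*W, k*W:(k+1)*W] = V[k]
    for k in range(m - 1):
        M[k*W:(k+1)*W, (k+1)*W:(k+2)*W] = T[k]
        M[(k+1)*W:(k+2)*W, k*W:(k+1)*W] = T[k].T
    return M

def G(m, i, j):
    """Block (i, j), 1-based, of the resolvent of the m-block truncation."""
    R = np.linalg.inv(X(m) - lam * np.eye(m * W))
    return R[(i-1)*W:i*W, (j-1)*W:j*W]

# Recursion (4.4): Gamma[k] holds Gamma_{k+1} in the notation of the text.
Gamma = [V[0] - lam * np.eye(W)]
for k in range(1, n):
    Gamma.append(V[k] - lam * np.eye(W)
                 - T[k-1].T @ np.linalg.inv(Gamma[k-1]) @ T[k-1])
recursion_ok = all(np.allclose(np.linalg.inv(Gamma[k]), G(k + 1, k + 1, k + 1))
                   for k in range(n))

# Product formula (4.2) for the pair i = 1, j = 4.
i, j = 1, 4
prod = np.eye(W)
for k in range(i, j):
    prod = prod @ G(k, k, k) @ T[k-1]      # factor G_k(k,k) T_k
prod = (-1) ** (j - i) * prod @ G(n, j, j)
product_ok = np.allclose(prod, G(n, i, j))
print(recursion_ok, product_ok)
```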

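We now make the change of variables $V_k \mapsto \Gamma_k$ in our probability space. By Lem. 3.1 and
Prop. 5, $\Gamma_k \in \mathcal{A}_W^{\mathrm{h}}$. As in the tri-diagonal case, the Jacobian determinant is 1, so

(4.6)  Joint distribution of $(\Gamma_k)_{k=1}^{n}$ given $(T_k)_{k=1}^{n-1}$

$$= \rho(\Gamma_1 + \lambda I) \prod_{k=2}^{n} \rho(\Gamma_k + \lambda I + T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}) \prod_{k=1}^{n} d\Gamma_k,$$

where $d\Gamma_k$ denotes Lebesgue measure on $\mathcal{A}_W^{\mathrm{h}}$. In terms of the matrices $\Gamma_k$, we have

(4.7)  $$G_n(i, j) = (-1)^{|i-j|}\, \Gamma_i^{-1} T_i \Gamma_{i+1}^{-1} T_{i+1} \cdots \Gamma_{j-1}^{-1} T_{j-1} \cdot (\Gamma_j - T_j \widetilde{G}_{j+1} T_j^\dagger)^{-1},$$

where $\widetilde{G}_{j+1}$ is a function of $(\Gamma_k)_{k=j}^{n}$ and $(T_k)_{k=j}^{n-1}$ (since $V_k = \Gamma_k + \lambda I + T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}$).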

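The matrix product in (4.7) is non-commutative, so it is not clear if the heuristic analysis
that the "log of $G$ is a sum of terms with only local correlations" is valid. Nonetheless, we may
use the trick employed above of coupling the system to a family of independent identically
distributed scalar variables $\alpha_2, \alpha_5, \ldots$, each with absolutely continuous distribution

(4.8)  $$H(\alpha_k)\, d\alpha_k = \frac{1}{2\eta} I[|\alpha_k| < \eta]\, d\alpha_k,$$

with $\eta > 0$ to be chosen below. We define

(4.9)  $$F_k = e^{\alpha_k} \Gamma_k,$$

where we take $\alpha_k = 0$ for $k \not\equiv 2 \bmod 3$. The Jacobian of the transformation $(\Gamma_k, \alpha_k) \mapsto$
$(F_k, \alpha_k)$ is $e^{-D_W \alpha_k}$, where $D_W = \dim \mathcal{A}_W^{\mathrm{h}}$ is the dimension of $\mathcal{A}_W^{\mathrm{h}}$. Thus

(4.10)  joint distribution of $(F_k)_{k=1}^{n}$ and $(\alpha_k)$, given $(T_k)_{k=1}^{n-1}$

$$= \prod_{k \equiv 2 \bmod 3} \rho(F_{k-1} + \lambda I + T_{k-2}^\dagger F_{k-2}^{-1} T_{k-2})\, \rho(e^{-\alpha_k} F_k + \lambda I + T_{k-1}^\dagger F_{k-1}^{-1} T_{k-1})$$
$$\times\, \rho(F_{k+1} + \lambda I + e^{\alpha_k} T_k^\dagger F_k^{-1} T_k)\, H(\alpha_k)\, e^{-D_W \alpha_k}\, dF_{k-1}\, dF_k\, dF_{k+1}\, d\alpha_k,$$

with the convention that $T_0 = 0$.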

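As in the tri-diagonal case, the variables $\alpha_k$ remain independent after conditioning on
$(F_k)_{k=1}^{n}$. Also, the

(4.11)  distribution of $\alpha_k$ given $(T_k)_{k=1}^{n-1}$ and $(F_k)_{k=1}^{n}$

$$= \frac{1}{Z_k}\, \rho(e^{-\alpha_k} F_k + \lambda I + T_{k-1}^\dagger F_{k-1}^{-1} T_{k-1})\, \rho(F_{k+1} + \lambda I + e^{\alpha_k} T_k^\dagger F_k^{-1} T_k)\, H(\alpha_k)\, e^{-D_W \alpha_k}\, d\alpha_k$$

with

(4.12)  $$Z_k = \frac{1}{2\eta} \int_{-\eta}^{\eta} \rho(e^{-\alpha} F_k + \lambda I + T_{k-1}^\dagger F_{k-1}^{-1} T_{k-1})\, \rho(F_{k+1} + \lambda I + e^{\alpha} T_k^\dagger F_k^{-1} T_k)\, e^{-D_W \alpha}\, d\alpha.$$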

Now fix a non-negative positive homogeneous $\Phi$ as in the statement of the Lemma.
Replacing $\Gamma_j$ in (4.7) by $e^{-\alpha_j} F_j$, we find that

(4.13) HGn{t,j)) = n ^"'"^ f (-1)'^"'' iii^k'n] %i 



k=2 mod 3 \ \k=i 

i<k<j-l 



where 
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is a function of (T^, F^, ak)k=j- Since (afc) are conditionally independent, it follows that 
(4.15) E([<|.(G„(z,j))]nm,F,)ti, Mt,) 



j+1 



v fe=i 



n E(e'-"'=|(r,,F,)t,)- 



k=2 mod 3 



By Proposition 3 and the Hölder inequality, we conclude that (compare with (2.30)):

(4.16) E([<|.(G„(z,j))]'') < E(e-^SS'^.(M)'^E([$(G„(2,j))]^)'/% 
where for k = 2 mod 3 

(4.17) hk{r,s) = - mm{r,q){s - max{r,q))VaTg{ak\{Ti, Fi)^^^)dq, 

Jo 

with $\mathrm{Var}_q$ as in (2.28), and we have set $h_k(r, s) = 0$ for $k \not\equiv 2 \bmod 3$.
Let us express $\mathrm{Var}_q(\alpha_k \mid (T_l, F_l))$ in terms of $(T_l, \Gamma_l, \alpha_l)$:

(4.18) 



(a-m)2e(''-^^)"i/fc(a)da 
Var,(a,|(T„F,),^,) = i^f^ ' e(^-^^)^.,(a)da 



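with

(4.19)  $$\nu_k(\alpha) = \rho(e^{\alpha_k - \alpha} \Gamma_k + \lambda I + T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1})\, \rho(\Gamma_{k+1} + \lambda I + e^{\alpha - \alpha_k} T_k^\dagger \Gamma_k^{-1} T_k).$$

Thus (compare with (2.36)),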

(4.20) Var,(a.|(T,,F,)t,) > ^ ^e^-e-^-- inf ^^ll^ ! ~ 

3 -2r,<Q,/3<2r, p(l4 + (e ^ - l)rfc) 



pjv,^, + (e" - i)rtr, ^r. 



inf 



With fl4.16p this implies 

(4.21) EmGnit,j))Y) < E(e-^'''^'"'''^""""'^'^^'^^('') 
where 



E([$(G„(^,j))]' 



\r/s 



(4 22) f/J.)= inf + (^'" - m) P(V..-1 + (e" - l)rtr,-^T,) 

-2v<a,0<2r, p{Vk + (c'^^ - l)Tk) -2„<a,/3<2„ p^Vk+l + (c^^ - l)T2r^^Tfc) ' 

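By fluctuation regularity of $\mathbb{P}_W$, we have $u_k(\eta) \ge \delta^2 I[A_k]$ where $I[A_k]$ is the indicator
function of the event:

(4.23)  $$A_k = \left\{ V_k, V_{k+1} \in \Omega_W,\ \|\Gamma_k\| \le \frac{\varepsilon}{e^{2\eta} - 1} W^{-\zeta},\ \text{and}\ \|T_k^\dagger \Gamma_k^{-1} T_k\| \le \frac{\varepsilon}{e^{2\eta} - 1} W^{-\zeta} \right\},$$

with $\delta, \varepsilon > 0$, $\zeta \ge 0$ and $\Omega_W$ as in assumption 2. In turn, since $V_k = \Gamma_k + \lambda I + T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}$,
we see that

(4.24)  $$A_k \supset \{V_{k+1} \in \Omega_W\} \cap \{V_k \in \Omega_W\} \cap \{\|T_{k-1}\|, \|T_k\| \le \tau W^b\}$$
$$\cap \left\{ \|\Gamma_{k-1}^{-1}\| \le \frac{1}{\tau^2} \left( \frac{\varepsilon}{e^{2\eta} - 1} W^{-\zeta} - LW^a - |\lambda| \right) W^{-2b} \right\} \cap \left\{ \|\Gamma_k^{-1}\| \le \frac{1}{\tau^2} \frac{\varepsilon}{e^{2\eta} - 1} W^{-2b-\zeta} \right\},$$

with $L, a \ge 0$ as in assumption 2 and $\tau > 0$, $b \ge 0$ as in assumption 3. This allows us to estimate
the probability of $A_k$ from below by successively integrating over $V_{k+1}$, $V_k$, $V_{k-1}$, $T_k$, and
$T_{k-1}$ in that order. To begin, by assumption 2,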

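(4.25)  $$\mathrm{Prob}(V_{k+1} \in \Omega_W \mid (V_i)_{i \ne k+1}, (T_i)) = \mathbb{P}_W(\Omega_W) \ge p_0 > 0.$$

Since $V_k = \Gamma_k + \lambda I + T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}$, we see from the Wegner estimate (3.8) that

(4.26)  $$\mathrm{Prob}\left( V_k \in \Omega_W,\ \|\Gamma_k^{-1}\| \le \frac{1}{\tau^2} \frac{\varepsilon}{e^{2\eta} - 1} W^{-2b-\zeta} \,\Big|\, (V_i)_{i \ne k, k+1}, (T_i) \right) \ge p_0 - \kappa \tau^2\, \frac{e^{2\eta} - 1}{\varepsilon}\, W^{1+\sigma+\zeta+2b}.$$

Similarly

(4.27)  $$\mathrm{Prob}\left( \|\Gamma_{k-1}^{-1}\| \le \frac{1}{\tau^2} \left( \frac{\varepsilon}{e^{2\eta} - 1} W^{-\zeta} - LW^a - |\lambda| \right) W^{-2b} \,\Big|\, (V_i)_{i \ne k-1, k, k+1}, (T_i) \right) \ge 1 - \frac{\kappa \tau^2\, W^{1+\sigma+2b}}{\frac{\varepsilon}{e^{2\eta} - 1} W^{-\zeta} - LW^a - |\lambda|}.$$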

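Combining these estimates and using assumption 3 to integrate over $T_k$ and $T_{k-1}$, we find

(4.28)  $$\mathrm{Prob}(A_k \mid (V_i)_{i \ne k-1, k, k+1}, (T_i)_{i \ne k, k-1}) \ge q_0^2\, p_0 \left( p_0 - \kappa \tau^2\, \frac{e^{2\eta} - 1}{\varepsilon}\, W^{1+\sigma+\zeta+2b} \right) \left( 1 - \frac{\kappa \tau^2\, W^{1+\sigma+2b}}{\frac{\varepsilon}{e^{2\eta} - 1} W^{-\zeta} - LW^a - |\lambda|} \right).$$

Taking $\eta = cW^{-\nu}$ with $\nu \ge \max(a, 2b + \sigma + 1) + \zeta$, we may choose $c$ sufficiently small to
make the r.h.s. larger than $\frac{1}{2} q_0^2 p_0^2$, say. Since $u_k(\eta) \ge \delta^2 I[A_k]$ we find, integrating successively
over $V_k$, $T_k$ from $k = i, \ldots, j - 1$ (see Lemma A.1), that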



(4.29) E(e-T'''«-^''--"-''EC:c/.W 



< exp 



s — r 
2s 



h - Jl 



Increasing $\nu$, if necessary, so that $\sup_{W} D_W W^{-\nu} < \infty$ completes the proof. □

5. Ensembles

In this section, we consider several examples of band matrix ensembles satisfying
assumptions 1, 2, and 3 of section 3. Assumption 1 is simply the choice of an algebra $\mathcal{A}_W$ to support
the distribution of the diagonal blocks, and the corresponding set $\mathcal{T}_W$ for the off-diagonal
blocks. In this regard, we will consider two cases:

(R) $\mathcal{A}_W = W \times W$ matrices with real entries,
or

(C) $\mathcal{A}_W = W \times W$ matrices with complex entries.
In each case the dimension of the algebra $D_W$ is comparable to $W^2$ and $\mathcal{T}_W = \mathcal{A}_W$.

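5.1. Wigner-matrix blocks and the Wegner estimate. We shall suppose that the
diagonal blocks $V_j$ of $X_{W;N}$ are Wigner matrices:

Assumption 4. The distribution of the diagonal blocks, $d\mathbb{P}_W(V)$, written in terms of the
matrix elements

(5.1)  $$V = \frac{1}{\sqrt{W}} \begin{pmatrix} d_1 & a_{1,2} & \cdots & a_{1,W} \\ a_{1,2}^* & d_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{W-1,W} \\ a_{1,W}^* & \cdots & a_{W-1,W}^* & d_W \end{pmatrix},$$

has the form

(5.2)  $$d\mathbb{P}_W(V) = \prod_{j=1}^{W} h(d_j)\, dd_j \prod_{1 \le i < j \le W} g(a_{i,j})\, da_{i,j},$$

where $dd_j$ is Lebesgue measure on the real line, $h \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$ is non-negative with
$\int h = 1$, and either

(R) $da_{i,j}$ is Lebesgue measure on $\mathbb{R}$ and $g \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$ is non-negative with $\int_{\mathbb{R}} g = 1$,
or

(C) $da_{i,j}$ is Lebesgue measure on $\mathbb{C}$ and $g \in L^\infty(\mathbb{C}) \cap L^1(\mathbb{C})$ is non-negative with $\int_{\mathbb{C}} g = 1$.
Furthermore, we require

(5.3)  $$\int_{\mathbb{R}} \lambda^2 h(\lambda)\, d\lambda < \infty,$$

(5.4)  $$\int |a|^4 g(a)\, da < \infty, \quad\text{and}\quad \int a\, g(a)\, da = 0.$$

Clearly the measure $d\mathbb{P}_W$ is absolutely continuous with respect to Lebesgue measure on
$\mathcal{A}_W^{\mathrm{h}}$, the hermitian elements of $\mathcal{A}_W$; this is part 1 of assumption 2. Regarding the Wegner
estimates, part 2 of assumption 2, we then have the following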



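Theorem 7 (Wegner estimate). Under assumption 4, the Wegner estimates (3.8) and (3.9)
hold with $\sigma = \frac{1}{2}$ and $\kappa = 2\pi\, \operatorname{ess\,sup}_\lambda h(\lambda)$.

Proof. This result, which is obtained by averaging over the diagonal variables $\{d_j\}$ only,
is a standard estimate from the theory of random Schrödinger operators, first obtained by
Wegner [20]. For completeness, we sketch the proof.

Note that $\|(V - A)^{-1}\| > t$ if and only if $V - A$ has an eigenvalue in the interval $(-\frac{1}{t}, \frac{1}{t})$.
It follows that

(5.5)  $$\mathrm{Prob}\{\|(V - A)^{-1}\| > t\} \le \frac{2}{t^2}\, \mathbb{E}\, \mathrm{tr} \left[ (V - A)^2 + \frac{1}{t^2} \right]^{-1} = \frac{2}{t}\, \mathbb{E}\, \mathrm{Im}\, \mathrm{tr} \left[ V - A - \frac{i}{t} I \right]^{-1} = \frac{2}{t} \sum_{i=1}^{W} \mathbb{E}\, \mathrm{Im} \left\langle e_i, \left[ V - A - \frac{i}{t} I \right]^{-1} e_i \right\rangle.$$

By the Schur complement formula,

(5.6)  $$\left\langle e_i, \left[ V - A - \frac{i}{t} I \right]^{-1} e_i \right\rangle = \frac{1}{\frac{1}{\sqrt{W}} d_i - \frac{i}{t} - \gamma},$$

where $\gamma$ is a function of all matrix elements of $V$ except $d_i$. Thus $\gamma$ is a random variable
independent of $d_i$, so

(5.7)  $$\mathbb{E}\, \mathrm{Im} \left\langle e_i, \left[ V - A - \frac{i}{t} I \right]^{-1} e_i \right\rangle = \mathbb{E}\, \frac{\frac{1}{t} + \mathrm{Im}\, \gamma}{\left( \frac{1}{\sqrt{W}} d_i - \mathrm{Re}\, \gamma \right)^2 + \left( \frac{1}{t} + \mathrm{Im}\, \gamma \right)^2} \le \pi \sqrt{W}\, \|h\|_\infty,$$

where the inequality follows from replacing the average $\int \cdot\, h(d_i)\, dd_i$ by the upper bound
$\|h\|_\infty \int_{\mathbb{R}} \cdot\, dd_i$. Summing over $i$ gives the result.

The proof of (3.9) is analogous. However in that case the trace is over a $2W$ dimensional
space, resulting in the additional factor of 2 on the r.h.s. of that equation. □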
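The first step of the proof rests on a pointwise inequality: for a real eigenvalue $\mu$, the indicator of $|\mu| < 1/t$ is dominated by $\frac{2}{t}\,\mathrm{Im}\,(\mu - i/t)^{-1}$. A minimal numerical check of that inequality is sketched below; the function name `wegner_weight` and the grid of test values are assumptions made for illustration.

```python
import numpy as np

def wegner_weight(mu, t):
    """(2/t) * Im (mu - i/t)^(-1), a smooth majorant of the indicator 1[|mu| < 1/t]."""
    return (2.0 / t) * (1.0 / (mu - 1j / t)).imag

mus = np.linspace(-5.0, 5.0, 2001)   # candidate eigenvalues of V - A
ts = (0.5, 1.0, 4.0, 50.0)           # thresholds for the resolvent norm

# The weight is positive everywhere and at least 1 whenever |mu| < 1/t.
dominates = all(
    np.all(wegner_weight(mus, t) >= (np.abs(mus) < 1.0 / t)) for t in ts
)
positive = all(np.all(wegner_weight(mus, t) > 0) for t in ts)
print(dominates, positive)
```

Summing this weight over the eigenvalues of $V - A$ gives the trace expression in (5.5).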



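The scaling factor $\sqrt{W}$ that appears in (5.1) is natural, as with this scaling the matrix
$V$ has a finite density of states in the large $W$ limit [19]:

(5.8)  $$\lim_{W \to \infty} \frac{1}{W} \int_{\mathcal{A}_W^{\mathrm{h}}} \mathrm{tr} f(V)\, d\mathbb{P}_W(V) = \frac{1}{2\sigma^2 \pi} \int_{-2\sigma}^{2\sigma} f(\lambda) \sqrt{4\sigma^2 - \lambda^2}\, d\lambda,$$

with $\sigma^2 = \int |a|^2 g(a)\, da$. A key fact below is the following related result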

Theorem 8 (Bai and Yin [4]). Let $V$ be a $W \times W$ random matrix of the form (5.1), with
$\{d_i\}$ and $\{a_{i,j}\}$ mutually independent sets of independent random variables. If

$$\mathbb{E}(|d_i|^2) < \infty, \quad \mathbb{E}(|a_{i,j}|^4) < \infty, \quad \mathbb{E}(a_{i,j}) = 0,$$

and $\sigma^2 = \mathbb{E}(|a_{i,j}|^2)$ then, for any $\eta > 0$,

(5.9)  $$\lim_{W \to \infty} \mathrm{Prob}[\|V\| > 2\sigma + \eta] = 0.$$

Remark. This follows from Theorem A of ref. [4], which gives the convergence of $\lambda_1$, the
largest eigenvalue of $V$, to $2\sigma$ with probability one. Symmetrizing the assumptions of
Theorem A and applying the result also to show that $\lambda_W$, the smallest eigenvalue of $V$, converges
to $-2\sigma$, this result follows. (The proof in [4] is written out in the real symmetric case, but
carries over to the complex hermitian case with only very minor modifications.)
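A quick simulation gives a feel for the statement. The sketch below samples one real symmetric matrix of the form (5.1) with standard Gaussian entries (so $\sigma = 1$) and compares its operator norm with $2\sigma = 2$. The matrix size, seed and Gaussian choice are assumptions of the illustration; a single sample only suggests, and does not prove, the limit.

```python
import numpy as np

rng = np.random.default_rng(0)
W = 400

# Off-diagonal entries a_{i,j} (variance 1, mean 0) and diagonal entries d_i.
a = rng.normal(size=(W, W))
upper = np.triu(a, 1)
d = rng.normal(size=W)

# V = W^{-1/2} * (symmetric matrix with entries d_i and a_{i,j}), as in (5.1).
V = (upper + upper.T + np.diag(d)) / np.sqrt(W)

norm_V = np.linalg.norm(V, 2)
print(norm_V)  # expected to be close to 2 * sigma = 2 for large W
```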

Corollary 9. Under assumption 4, we may find $p_0, L > 0$ such that

(5.10)  $$\mathrm{Prob}[\|V\| \le L] \ge p_0.$$

We require very little of the off-diagonal blocks $T_j$. They need only satisfy the estimate
(3.13) analogous to (3.10) and (5.10). In particular, they could be deterministic, say $T_j = I$
for all $j$ or $T_j$ given by a Toeplitz matrix. In this section we consider a few examples of
random off-diagonal blocks modeled on the blocks for the Gaussian band ensemble (1.1).
In that case, the off-diagonal blocks $T_j$ are lower triangular matrices with Gaussian entries.
More generally we may suppose

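Assumption 5. The distribution of the off-diagonal blocks, $d\mathbb{Q}_W(T)$, written in terms of
the matrix elements

(5.11)  $$T = \frac{1}{\sqrt{W}} \begin{pmatrix} 0 & & & \\ t_{2,1} & 0 & & \\ \vdots & \ddots & \ddots & \\ t_{W,1} & \cdots & t_{W,W-1} & 0 \end{pmatrix},$$

has the form

(5.12)  $$d\mathbb{Q}_W(T) = \prod_{1 \le j < i \le W} d\mu(t_{i,j}),$$

where either

(R) $\mu(t_{i,j})$ is a probability measure on $\mathbb{R}$,

or

(C) $\mu(t_{i,j})$ is a probability measure on $\mathbb{C}$,

and

(5.13)  $$\int |t|^4\, d\mu(t) < \infty, \quad\text{and}\quad \int t\, d\mu(t) = 0.$$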



Theorem 10. Under assumption 5, we may find $q_0, \tau > 0$ such that assumption 3 holds with
$b = 0$, i.e.,

(5.14)  $$\mathrm{Prob}[\|T\| \le \tau] \ge q_0.$$

Proof. It follows from [4, Theorem A] that, with $\sigma^2 = \int |t|^2\, d\mu(t)$,

(5.15)  $$\lim_{W \to \infty} \mathrm{Prob}\left[ \|T + T^\dagger\| > 2\sigma + \eta \right] = 0, \qquad \lim_{W \to \infty} \mathrm{Prob}\left[ \|i(T - T^\dagger)\| > 2\sigma + \eta \right] = 0,$$

for any $\eta > 0$. Since

(5.16)  $$T = \frac{1}{2}(T + T^\dagger) + \frac{1}{2i}\, i(T - T^\dagger),$$

it follows that

(5.17)  $$\lim_{W \to \infty} \mathrm{Prob}\left[ \|T\| > 2\sigma + \eta \right] = 0.$$

Thus (5.14) holds. □

5.2. Fluctuation regularity. A particular example of distributions satisfying assumption 4
are the Gaussian Orthogonal Ensemble (GOE), corresponding to case (R), and the Gaussian
Unitary Ensemble (GUE), corresponding to case (C). In these cases, the measure $\mathbb{P}$ is of the
form

(5.18) dF{V) oc e-^'^'^^'dV, V E A^, 
with $\beta = 1$ (R) or $\beta = 2$ (C). That is,

(5.19) h{d) = -^e-^''\ g{a) = —L^e"^^!'^!'. 



'T^ ' ' (2/?7r)2 

Theorem 11. If $V$ is a GUE or GOE matrix of size $W$ then assumption 2 of section 3 holds
with $\sigma = \frac{1}{2}$, $\zeta = 2$ and $a = 0$.

Corollary 12. Assumptions 1, 2, and 3 hold for the Gaussian band ensemble (1.1).

Proof. We have already derived the Wegner estimates (Thm. 7). It remains only to show
the fluctuation regularity. For the Gaussian ensembles, we have

(•5.20) ^(^1) = ^-/3Wtr{V-'-Vi) ^ ^-fSW tr{Vi~V2)iVi+V2) y ^-f^W^Vi-ViWWVi+ViW 

P(^2) 

If $\|V_1 - V\|, \|V_2 - V\| \le \varepsilon W^{-2}$ we have

Letting $p_0$ and $L$ be as in Cor. 9, we set $\Omega_W := \{\|V\| \le L\}$. Then $\mathrm{Prob}(\Omega_W) \ge p_0 > 0$
and if $V \in \Omega_W$, we have

(5.22) 4^ > e-2^(^+^) := 5, 

whenever $\|V_1 - V\|, \|V_2 - V\| \le \varepsilon W^{-2}$. □
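The Gaussian estimate relies on two elementary facts: $\mathrm{tr}(V_1^2 - V_2^2) = \mathrm{tr}\,(V_1 - V_2)(V_1 + V_2)$, which holds because the trace of a commutator vanishes, and $|\mathrm{tr}(AB)| \le W \|A\| \|B\|$ for $W \times W$ matrices. The sketch below checks both on random real symmetric matrices; the size, seed and helper name are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
W = 6

def random_symmetric(size):
    """Random real symmetric matrix (illustrative helper)."""
    m = rng.normal(size=(size, size))
    return (m + m.T) / 2

V1 = random_symmetric(W)
V2 = random_symmetric(W)

lhs = np.trace(V1 @ V1 - V2 @ V2)
mid = np.trace((V1 - V2) @ (V1 + V2))
bound = W * np.linalg.norm(V1 - V2, 2) * np.linalg.norm(V1 + V2, 2)

trace_identity = np.isclose(lhs, mid)   # the commutator [V1, V2] is traceless
trace_bound = abs(mid) <= bound         # |tr(AB)| <= W ||A|| ||B||
print(trace_identity, trace_bound)
```

With $\|V_1 - V_2\|$ of order $W^{-2}$ and $\|V_1 + V_2\|$ of order one, the bound shows why an exponent of the form $W \cdot \mathrm{tr}(\cdot)$ stays bounded, which is the origin of $\zeta = 2$.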

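To obtain fluctuation regularity (3.11) for general Wigner matrices we require additional
assumptions on $h$ and $g$. For instance, we have the following

Theorem 13. If $V$ satisfies assumption 4 with $\ln h$ and $\ln g$ uniformly Hölder continuous
with exponent $\alpha$, then assumption 2 of section 3 holds with $\sigma = \frac{1}{2}$, $\zeta = \frac{1}{2} + \frac{2}{\alpha}$ and $a = 0$.

Remark. For example $h(\lambda) = g(\lambda) = c_\alpha e^{-|\lambda|^\alpha}$ with $0 < \alpha < 1$ satisfies the hypotheses of the
theorem.

Proof. We have

(5.23)  $$\frac{\rho(V_1)}{\rho(V_2)} = \exp\left( \sum_i \left[ \ln h(d_{i;1}) - \ln h(d_{i;2}) \right] + \sum_{i<j} \left[ \ln g(a_{i,j;1}) - \ln g(a_{i,j;2}) \right] \right)$$
$$\ge \exp\left( -C \left[ \sum_i |d_{i;1} - d_{i;2}|^\alpha + \sum_{i<j} |a_{i,j;1} - a_{i,j;2}|^\alpha \right] \right).$$

If $\|V_1 - V\|, \|V_2 - V\| \le \varepsilon W^{-\zeta}$, then

$$|d_{i;1} - d_{i;2}|,\ |a_{i,j;1} - a_{i,j;2}| \le \sqrt{W} \|V_1 - V\| + \sqrt{W} \|V_2 - V\| \le 2\varepsilon W^{\frac{1}{2} - \zeta}.$$

It follows that

(5.24)  $$\frac{\rho(V_1)}{\rho(V_2)} \ge \exp\left[ -C (2\varepsilon)^\alpha\, W^2\, W^{\alpha(\frac{1}{2} - \zeta)} \right] = e^{-C(2\varepsilon)^\alpha} =: \delta.$$

This estimate holds for every $V$, so in particular for all $V$ in $\Omega_W = \{\|V\| \le L\}$. □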

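Theorem 13 cannot apply if $h$ or $g$ has compact support. Nonetheless compactly supported
densities can be handled. A general result of this type would be somewhat involved to state, so
let us simply note that assumption 2 holds if $h$ and $g$ are proportional to characteristic
functions of open neighborhoods of the origin.

Theorem 14. Suppose that $V$ satisfies assumption 4 and that

$$h(d) = \frac{1}{2D} I[|d| < D], \qquad g(a) = \frac{1}{c_\beta A^\beta} I[|a| < A],$$

with $c_1 = 2$ and $c_2 = \pi$. Then assumption 2 of section 3 holds with $\sigma = \frac{1}{2}$, $\zeta = \frac{5}{2}$ and $a = 0$.

Proof. Clearly the moment conditions of assumption 4 hold. Thus by Cor. 9 we can find $p_0$
and $L$ so that (5.10) holds.

Now suppose $\|V_1 - V\|, \|V_2 - V\| \le \varepsilon W^{-\frac{5}{2}}$. Suppose also that the matrix elements of $V$
satisfy $W^{-\frac{1}{2}} |d_i| \le W^{-\frac{1}{2}} D - \varepsilon W^{-\frac{5}{2}}$, $W^{-\frac{1}{2}} |a_{i,j}| \le W^{-\frac{1}{2}} A - \varepsilon W^{-\frac{5}{2}}$ for all $i, j$. Then

(5.25)  $$\frac{\rho(V_1)}{\rho(V_2)} = 1.$$

But

(5.26)  $$\mathrm{Prob}\left( W^{-\frac{1}{2}} |d_i| \le W^{-\frac{1}{2}} D - \varepsilon W^{-\frac{5}{2}},\ W^{-\frac{1}{2}} |a_{i,j}| \le W^{-\frac{1}{2}} A - \varepsilon W^{-\frac{5}{2}} \text{ for all } i, j \right)$$
$$\ge 1 - \sum_i \mathrm{Prob}(|d_i| > D - \varepsilon W^{-2}) - \sum_{i<j} \mathrm{Prob}(|a_{i,j}| > A - \varepsilon W^{-2}) \ge 1 - C\varepsilon.$$

Now let

(5.27)  $$\Omega_W = \left\{ V : \|V\| \le L,\ W^{-\frac{1}{2}} |d_i| \le W^{-\frac{1}{2}} D - \varepsilon W^{-\frac{5}{2}},\ \text{and}\ W^{-\frac{1}{2}} |a_{i,j}| \le W^{-\frac{1}{2}} A - \varepsilon W^{-\frac{5}{2}} \right\}$$

with $\varepsilon$ sufficiently small that

(5.28)  $$\mathrm{Prob}(\Omega_W) \ge p_0 - C\varepsilon > 0.$$ □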

5.3. Summary. Putting the results of this section together with Thm. 6 we have:

Theorem 15. Let $\mathcal{A}_W = \mathcal{T}_W =$ set of $W \times W$ matrices with real or complex entries and
suppose $\mathbb{P}$ and $\mathbb{Q}$ satisfy assumptions 4 and 5.

(1) If $\mathbb{P}$ is either the Gaussian orthogonal or Gaussian unitary ensemble, then given $r > 0$
and $s \in (0, 1)$ there are $A_s < \infty$ and $\alpha_s > 0$ such that

(5.29)  $$\mathbb{E}\left( |\langle e_i, (X_{W;N} - \lambda)^{-1} e_j \rangle|^s \right) \le A_s W^{\frac{s}{2}} e^{-\alpha_s \frac{|i-j|}{W^8}}, \quad \lambda \in [-r, r].$$

In particular, (5.29) holds for the Gaussian band ensemble (1.1).

(2) If $\ln h$ and $\ln g$ are uniformly Hölder continuous with exponent $\alpha$, then given $r > 0$
and $s \in (0, 1)$ there are $A_s < \infty$ and $\alpha_s > 0$ such that

(5.30)  $$\mathbb{E}\left( |\langle e_i, (X_{W;N} - \lambda)^{-1} e_j \rangle|^s \right) \le A_s W^{\frac{s}{2}} e^{-\alpha_s \frac{|i-j|}{W^\mu}}, \quad \lambda \in [-r, r],$$

with $\mu = 5 + \frac{4}{\alpha}$.

(3) If $h$ and $g$ are proportional to characteristic functions of open neighborhoods of the
origin, then given $r > 0$ and $s \in (0, 1)$ there are $A_s < \infty$ and $\alpha_s > 0$ such that (5.30)
holds with $\mu = 9$.
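The exponents quoted in the three cases can be cross-checked against Theorem 6 with a few lines of exact arithmetic. The sketch below assumes the smallest admissible value $\nu = \zeta + \max(a, 1 + \sigma + 2b)$ and the localization exponent $\mu = 2\nu + 1$; the function names are hypothetical, and the values $\sigma = \frac{1}{2}$, $a = b = 0$ are the ones stated for the ensembles of this section.

```python
from fractions import Fraction

def localization_exponent(zeta, a=0, sigma=Fraction(1, 2), b=0):
    """mu = 2*nu + 1 with nu = zeta + max(a, 1 + sigma + 2b), the smallest allowed nu."""
    nu = zeta + max(a, 1 + sigma + 2 * b)
    return 2 * nu + 1

def holder_exponent(alpha):
    """Case (2): zeta = 1/2 + 2/alpha for Hölder exponent alpha."""
    return localization_exponent(Fraction(1, 2) + 2 / Fraction(alpha))

mu_gaussian = localization_exponent(Fraction(2))               # case (1), zeta = 2
mu_characteristic = localization_exponent(Fraction(5, 2))      # case (3), zeta = 5/2
mu_holder_half = holder_exponent(Fraction(1, 2))               # case (2), alpha = 1/2

print(mu_gaussian, mu_characteristic, mu_holder_half)
```

In each case $\nu \ge 2$, so the additional requirement that $D_W W^{-\nu}$ stay bounded is compatible with $D_W$ being comparable to $W^2$.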

Appendix A. A lemma on conditional averages 

In the proofs of the various versions of Lemma F above, a key step was to estimate averages
of the form

(A.1)  $$\mathbb{E}\left( e^{-\sum_j U_j} \right)$$

in which $U_j$ are non-negative, strictly positive with good probability, but not independent.
The following Lemma gives the relevant estimate, which can be seen as a simple version of
stochastic domination. As the proof shows, under appropriate assumptions, we can estimate
(A.1) in terms of the same expression with $U_j$ replaced by i.i.d. non-negative Bernoulli
variables taking the value $0$ with probability less than 1.

Lemma A.1. Let $\Sigma_j$ be a sequence of $\sigma$-algebras of events on a probability space and let $U_j$
be a sequence of non-negative random variables with $U_k$ measurable with respect to $\Sigma_j$ for
$k < j$. If for some $\delta > 0$,

$$\mathrm{Prob}(U_j \ge \delta \mid \Sigma_j) \ge p_0$$

for each $j$, then

$$\mathbb{E}\left( e^{-\sum_{j=1}^{n} U_j} \right) \le e^{-(1 - e^{-\delta}) p_0 n}.$$

Proof. This follows by induction, since

$$\mathbb{E}\left( e^{-\sum_{j=1}^{n} U_j} \,\Big|\, \Sigma_n \right) = e^{-\sum_{j=1}^{n-1} U_j}\, \mathbb{E}\left( e^{-U_n} \mid \Sigma_n \right) \le \left[ (1 - p_0) + e^{-\delta} p_0 \right] e^{-\sum_{j=1}^{n-1} U_j}$$

and

$$(1 - p_0) + e^{-\delta} p_0 \le e^{-(1 - e^{-\delta}) p_0}.$$ □
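In the extremal i.i.d. Bernoulli case mentioned above, where each $U_j$ equals $\delta$ with probability $p_0$ and $0$ otherwise, the average (A.1) can be computed in closed form and compared with the bound of Lemma A.1. The following is a minimal sketch; the function names and the grid of parameter values are assumptions made for illustration.

```python
import math

def bernoulli_average(p0, delta, n):
    """E exp(-sum U_j) for n i.i.d. variables equal to delta w.p. p0 and 0 otherwise."""
    return ((1.0 - p0) + p0 * math.exp(-delta)) ** n

def lemma_bound(p0, delta, n):
    """Right-hand side of Lemma A.1: exp(-(1 - e^{-delta}) * p0 * n)."""
    return math.exp(-(1.0 - math.exp(-delta)) * p0 * n)

# The bound reduces to 1 - x <= exp(-x) with x = (1 - e^{-delta}) * p0.
bound_holds = all(
    bernoulli_average(p0, delta, n) <= lemma_bound(p0, delta, n) * (1 + 1e-12)
    for p0 in (0.05, 0.3, 0.7, 1.0)
    for delta in (0.01, 0.5, 2.0, 10.0)
    for n in (1, 5, 50)
)
print(bound_holds)
```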
References 

1. M Aizenman and S Molchanov, Localization at large disorder and at extreme energies: An elementary
derivation, Comm. Math. Phys. 157 (1993), no. 2, 245-278.

2. M Aizenman, JH Schenker, RM Friedrich, and D Hundertmark, Finite-volume fractional-moment criteria
for Anderson localization, Comm. Math. Phys. 224 (2001), no. 1, 219-253.

3. M Aizenman, Localization at weak disorder: some elementary bounds, Rev. Math. Phys. 6 (1994), no. 5A,
1163-1182, Special issue dedicated to Elliott H. Lieb.

4. Z D Bai and Y Q Yin, Necessary and sufficient conditions for almost sure convergence of the largest
eigenvalue of a Wigner matrix, Ann. Probab. 16 (1988), no. 4, 1729-1741.

5. J V Bellissard, P D Hislop, and G Stolz, Correlation estimates in the Anderson model, J. Stat. Phys.
129 (2007), no. 4, 649-662.

6. R Carmona and J Lacroix, Spectral theory of random Schrödinger operators, Probability and its
Applications, Birkhäuser Boston Inc., Boston, MA, 1990.

7. G Casati, L Molinari, and F Izrailev, Scaling properties of band random matrices, Phys. Rev. Lett. 64
(1990), no. 16, 1851-1854.

8. G Casati, B V Chirikov, I Guarneri, and F M Izrailev, Band-random-matrix model for quantum
localization in conservative systems, Phys. Rev. E 48 (1993), no. 3, R1613.

9. J-M Combes, F Germinet, and A Klein, Generalized eigenvalue-counting estimates for the Anderson
model, Preprint, 2008.

10. R L Dobrushin and S B Shlosman, Absence of breakdown of continuous symmetry in two-dimensional
models of statistical physics, Comm. Math. Phys. 42 (1975), no. 1, 31-40.

11. F J Dyson, Statistical theory of the energy levels of complex systems. I, J. Math. Phys. 3 (1962), no. 1,
140.

12. F J Dyson, Statistical theory of the energy levels of complex systems. II, J. Math. Phys. 3 (1962), no. 1,
157.

13. Y V Fyodorov and A D Mirlin, Scaling properties of localization in random band matrices: A σ-model
approach, Phys. Rev. Lett. 67 (1991), no. 18, 2405-2409.

14. G-M Graf and A Vaghi, A remark on the estimate of a determinant by Minami, Lett. Math. Phys. 79
(2007), no. 1, 17-22.

15. A Klein and S Molchanov, Simplicity of eigenvalues in the Anderson model, J. Stat. Phys. 122 (2006),
no. 1, 95-99.

16. H Kunz and B Souillard, Sur le spectre des opérateurs aux différences finies aléatoires, Comm. Math.
Phys. 78 (1980), 201-246.

17. N D Mermin and H Wagner, Absence of ferromagnetism or antiferromagnetism in one- or
two-dimensional isotropic Heisenberg models, Phys. Rev. Lett. 17 (1966), no. 22, 1133-1136.

18. N Minami, Local fluctuation of the spectrum of a multidimensional Anderson tight binding model, Comm.
Math. Phys. 177 (1996), no. 3, 709-725.

19. S A Molchanov, L A Pastur, and A M Khorunzhii, Limiting eigenvalue distribution for band random
matrices, Theor. Math. Phys. 90 (1992), no. 2, 108-118.

20. F Wegner, Bounds on the density of states in disordered systems, Zeit. Phys. B 44 (1981), no. 1-2, 9-15.

Michigan State University, East Lansing, Michigan 48824
E-mail address: jeffrey@math.msu.edu



